Let $\phi(G)$ be the minimum conductance of an undirected graph $G$, and let
$0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$ be the eigenvalues of the normalized Laplacian matrix of $G$.
We prove that for any graph $G$ and any $k \ge 2$,

$$\phi(G) = O(k)\,\frac{\lambda_2}{\sqrt{\lambda_k}},$$

and this performance guarantee is achieved by the spectral partitioning algorithm. This improves
Cheeger's inequality, and the bound is optimal up to a constant factor for any $k$. Our result shows
that the spectral partitioning algorithm is a constant factor approximation algorithm for finding a sparse
cut if $\lambda_k$ is a constant for some constant $k$. This provides some theoretical justification to its empirical
performance in image segmentation and clustering problems. We extend the analysis to other graph
partitioning problems, including multi-way partition, balanced separator, and maximum cut.
1 Introduction



We study the performance of spectral algorithms for graph partitioning problems. For the moment, we
assume the graphs are unweighted and $d$-regular for simplicity, while the results in the paper hold for arbitrary
weighted graphs, with suitable changes to the definitions. Let $G = (V, E)$ be a $d$-regular undirected graph.
The conductance of a subset $S \subseteq V$ is defined as

$$\phi(S) := \frac{|E(S, \bar{S})|}{d \cdot \min\{|S|, |\bar{S}|\}},$$

where $E(S, \bar{S})$ denotes the set of edges of $G$ crossing from $S$ to its complement. The conductance of the
graph $G$ is defined as

$$\phi(G) = \min_{S \subset V} \phi(S).$$

Finding a set of small conductance, also called a sparse cut, is an algorithmic problem that comes up
in different areas of computer science. Some applications include image segmentation [SM00, TM06],
clustering [NJW01, KVV04, Lux07], community detection [LLM10], and designing approximation
algorithms [Shm97].

A fundamental result in spectral graph theory provides a connection between the conductance of a graph
and the second eigenvalue of its normalized Laplacian matrix. The normalized Laplacian matrix
$\mathcal{L} \in \mathbb{R}^{V \times V}$ is defined as $\mathcal{L} = I - \frac{1}{d} A$, where $A$ is the adjacency matrix of $G$. The eigenvalues
of $\mathcal{L}$ satisfy $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_{|V|} \le 2$. It is a basic fact that $\phi(G) = 0$ if and only if
$\lambda_2 = 0$. Cheeger's inequality for graphs provides a quantitative generalization of this fact:

$$\frac{1}{2}\lambda_2 \le \phi(G) \le \sqrt{2\lambda_2}. \tag{1.1}$$

This was first proved in the manifold setting by Cheeger [Che70] and was extended to undirected graphs by
Alon and Milman [AM85, Alo86]. Cheeger's inequality is an influential result in spectral graph theory with
applications in spectral clustering [ST07, KVV04], explicit construction of expander graphs [JM85, HLW06,
Lee12], approximate counting [SJ89, JSV04], and image segmentation [SM00].

We improve Cheeger's inequality using higher eigenvalues of the normalized Laplacian matrix.

Theorem 1.1. For every undirected graph $G$ and any $k \ge 2$, it holds that

$$\phi(G) = O(k)\,\frac{\lambda_2}{\sqrt{\lambda_k}}.$$

This improves Cheeger's inequality, as it shows that $\lambda_2$ is a better approximation of $\phi(G)$ when there is a
large gap between $\lambda_2$ and $\lambda_k$ for any $k \ge 3$. The bound is optimal up to a constant factor for any $k \ge 2$, as
the cycle example shows that $\phi(G) = \Omega(k\lambda_2/\sqrt{\lambda_k})$ for any $k \ge 2$.
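As a numerical illustration of the cycle example, the following sketch compares $\phi(G)$ with $k\lambda_2/\sqrt{\lambda_k}$ on the $n$-cycle. It relies on two standard facts that are assumed here rather than taken from the text: the normalized Laplacian of the $n$-cycle has eigenvalues $1 - \cos(2\pi j/n)$ for $j = 0, \dots, n-1$, and its conductance is attained by an arc of $\lfloor n/2 \rfloor$ consecutive vertices. The function names are illustrative only.

```python
import math

def cycle_eigenvalues(n):
    """Sorted eigenvalues of the normalized Laplacian I - A/2 of the n-cycle
    (assumed closed form: 1 - cos(2*pi*j/n) for j = 0..n-1)."""
    return sorted(1.0 - math.cos(2.0 * math.pi * j / n) for j in range(n))

def cycle_conductance(n):
    """Conductance of the n-cycle: an arc of floor(n/2) vertices is crossed by
    2 edges, so phi = 2 / (d * floor(n/2)) with d = 2."""
    return 1.0 / (n // 2)

def bound_ratio(n, k):
    """Ratio phi(G) / (k * lambda_2 / sqrt(lambda_k)) on the n-cycle."""
    lam = cycle_eigenvalues(n)
    lam2, lamk = lam[1], lam[k - 1]          # 1-indexed lambda_2 and lambda_k
    return cycle_conductance(n) / (k * lam2 / math.sqrt(lamk))

for k in (2, 3, 4, 8):
    print(k, round(bound_ratio(1000, k), 3))
```

For fixed small $k$ the printed ratio stays bounded away from zero as $n$ grows, which is what the $\Omega(k\lambda_2/\sqrt{\lambda_k})$ statement predicts for this example.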



1.1 The Spectral Partitioning Algorithm 

The proof of Cheeger's inequality is constructive and it gives the following simple nearly-linear time algorithm
(the spectral partitioning algorithm) that finds cuts with approximately minimal conductance. Compute the
second eigenfunction $g \in \mathbb{R}^V$ of the normalized Laplacian matrix $\mathcal{L}$, and let $f = g/\sqrt{d}$. For a threshold
$t \in \mathbb{R}$, let $V(t) := \{v : f(v) \ge t\}$ be a threshold set of $f$. Return the threshold set of $f$ with the minimum
conductance among all thresholds $t$. Let $\phi(f)$ denote the conductance of the set returned by the algorithm. The
proof of Cheeger's inequality shows that $\frac{1}{2}\lambda_2 \le \phi(f) \le \sqrt{2\lambda_2}$, and hence the spectral partitioning algorithm
is a nearly-linear time $O(1/\sqrt{\lambda_2})$-approximation algorithm for finding a sparse cut. In particular, it gives a
constant factor approximation algorithm when $\lambda_2$ is a constant, but since $\lambda_2$ could be as small as $1/n^2$ even
for a simple unweighted graph (e.g. for the cycle), the performance guarantee could be $\Omega(n)$.
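The thresholding step of the algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: it takes the vector $f$ as given (computing the second eigenfunction is left to any eigensolver), assumes an unweighted graph stored as adjacency lists, and uses hypothetical names such as `sweep_cut`. The usage example feeds it a phase-shifted cosine on the 20-cycle, which lies in the $\lambda_2$ eigenspace of the cycle.

```python
import math

def sweep_cut(adj, f):
    """Return (phi, S): the threshold set S = {v : f(v) >= t} of minimum conductance.

    adj maps each vertex to a list of neighbours (unweighted, undirected, no
    self-loops); f maps each vertex to a real value.
    """
    order = sorted(adj, key=lambda v: f[v], reverse=True)
    total_vol = sum(len(adj[v]) for v in adj)
    inside, cut, vol = set(), 0, 0
    best_phi, best_set = float("inf"), None
    for i, v in enumerate(order[:-1]):       # the full vertex set is not a valid cut
        inside.add(v)
        vol += len(adj[v])
        for u in adj[v]:
            cut += -1 if u in inside else 1  # edge becomes internal, or starts crossing
        if f[order[i + 1]] == f[v]:
            continue                         # a prefix that splits ties is not a threshold set
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_set = phi, set(inside)
    return best_phi, best_set

# Usage: the 20-cycle with a phase-shifted cosine (the shift avoids ties).
n = 20
cycle = {j: [(j - 1) % n, (j + 1) % n] for j in range(n)}
f = {j: math.cos(2 * math.pi * (j + 0.25) / n) for j in range(n)}
phi, S = sweep_cut(cycle, f)
```

Sorting dominates the running time; the cut weight and volume are updated incrementally, so the sweep itself touches each edge a constant number of times.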

We prove Theorem 1.1 by showing a stronger statement, namely that $\phi(f)$ is upper-bounded by $O(k\lambda_2/\sqrt{\lambda_k})$.

Theorem 1.2. For any undirected graph $G$, and $k \ge 2$,

$$\phi(f) = O(k)\,\frac{\lambda_2}{\sqrt{\lambda_k}}.$$

This shows that the spectral partitioning algorithm is an $O(k/\sqrt{\lambda_k})$-approximation algorithm for the sparsest
cut problem, even though it does not employ any information about higher eigenvalues. In particular,
spectral partitioning provides a constant factor approximation for the sparsest cut problem when $\lambda_k$ is a
constant for some constant $k$.



1.2 Generalizations of Cheeger's Inequality 

There are several recent results showing new connections between the expansion profile of a graph and the
higher eigenvalues of its normalized Laplacian matrix. The first result in this direction is about the small set
expansion problem. Arora, Barak and Steurer [ABS10] show that if there are $k$ small eigenvalues for some
large $k$, then the graph has a sparse cut $S$ with $|S| \approx n/k$. In particular, if $k = |V|^{\epsilon}$ for $\epsilon \in (0, 1)$, then
the graph has a sparse cut $S$ with $\phi(S) \le O(\sqrt{\lambda_k})$ and $|S| \approx n/k$. This can be seen as a generalization of
Cheeger's inequality to the small set expansion problem (see [Ste10, OT12, OW12] for some improvements).

Cheeger's inequality for graph partitioning can also be extended to a higher-order Cheeger's inequality for
$k$-way graph partitioning [LRTV12, LOT12]: if there are $k$ small eigenvalues, then there are $k$ disjoint sparse
cuts. Let

$$\phi_k(G) := \min_{S_1, \dots, S_k} \max_{1 \le i \le k} \phi(S_i),$$

where the minimum is over non-empty disjoint subsets $S_1, \dots, S_k \subseteq V$. Then

$$\frac{1}{2}\lambda_k \le \phi_k(G) \le O(k^2)\sqrt{\lambda_k}.$$

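Our result can be applied to $k$-way graph partitioning by combining with a result in [LOT12].

Corollary 1.3. For every undirected graph $G$ and any $l > k \ge 2$, it holds that

(i)

$$\phi_k(G) \le O(l k^6)\,\frac{\lambda_k}{\sqrt{\lambda_l}}.$$

(ii) For any $\delta \in (0, 1)$,

$$\phi_{(1-\delta)k}(G) \le O\!\left(\frac{l \log^2 k}{\delta^8 k}\right) \frac{\lambda_k}{\sqrt{\lambda_l}}.$$

(iii) If $G$ excludes $K_h$ as a minor, then for any $\delta \in (0, 1)$,

$$\phi_{(1-\delta)k}(G) \le O\!\left(\frac{h^4 l}{\delta^5 k}\right) \frac{\lambda_k}{\sqrt{\lambda_l}}.$$

Part (i) shows that $\lambda_k$ is a better approximation of $\phi_k(G)$ when there is a large gap between $\lambda_k$ and $\lambda_l$
for any $l > k$. Part (ii) implies that $\phi_{0.9k}(G) \le O(\lambda_k \log^2 k/\sqrt{\lambda_{2k}})$, and similarly part (iii) implies that
$\phi_{0.9k}(G) \le O(\lambda_k/\sqrt{\lambda_{2k}})$ for planar graphs.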

Furthermore, our proof shows that the spectral algorithms in [LOT12] achieve the corresponding approximation
factors. For instance, when $\lambda_l$ is a constant for a constant $l > k$, there is a constant factor approximation
algorithm for the $k$-way partitioning problem.

1.3 Analysis of Practical Instances 

Spectral partitioning is a popular heuristic in practice, as it is easy to implement and can be solved
efficiently by standard linear algebra methods. Also, it has good empirical performance in applications
including image segmentation [SM00] and clustering [Lux07], much better than the worst case performance
guarantee provided by Cheeger's inequality. It has been an open problem to explain this phenomenon
rigorously [ST07, GM98]. There are some research directions towards this objective.

One direction is to analyze the average case performance of spectral partitioning. A well-studied model is
the random planted model [Bop87, AKS98, McS01], where there is a hidden bisection $(X, Y)$ of $V$, there
is an edge between two vertices in $X$, or between two vertices in $Y$, with probability $p$, and there is an edge
between a vertex in $X$ and a vertex in $Y$ with probability $q$. It is proved that spectral techniques can be used to
recover the hidden partition with high probability, as long as $p - q \ge \Omega(\sqrt{p \log |V| / |V|})$ [Bop87, McS01].
The spectral approach can also be used for other hidden graph partitioning problems [AKS98, McS01]. Note
that the spectral algorithms used are usually not exactly the same as the spectral partitioning algorithm.
Some of these proofs explicitly or implicitly use the fact that there is a gap between the second and the third
eigenvalues. See Subsection 4.5 for more details.
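For concreteness, an instance of the planted model described above could be sampled as in the following sketch. The function name, the vertex labelling ($X = \{0, \dots, n/2 - 1\}$, $Y$ the rest) and the `seed` parameter are assumptions made for illustration; the cited papers may set up the model differently in detail.

```python
import random

def planted_bisection(n_half, p, q, seed=None):
    """Sample a planted bisection instance.

    X = {0, ..., n_half - 1} and Y = {n_half, ..., 2 * n_half - 1}. Each pair on
    the same side becomes an edge with probability p, each pair across the
    bisection with probability q. Returns (edges, X, Y).
    """
    rng = random.Random(seed)
    n = 2 * n_half
    X, Y = set(range(n_half)), set(range(n_half, n))
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            same_side = (u in X) == (v in X)
            if rng.random() < (p if same_side else q):
                edges.append((u, v))
    return edges, X, Y

edges, X, Y = planted_bisection(50, p=0.5, q=0.1, seed=0)
```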

To better model practical instances, Bilu and Linial [BL10] introduced the notion of stable instances for
clustering problems. One definition for the sparsest cut problem is as follows: an instance is said to be
$\gamma$-stable if there is an optimal sparse cut $S \subseteq V$ which will remain optimal even if the weight of each edge is
perturbed by a factor of $\gamma$. Intuitively this notion is to capture the instances with an outstanding solution
that is stable under noise, and arguably they are the meaningful instances in practice. Note that a planted
bisection instance is stable if $p - q$ is large enough, and so this is a more general model than the planted random
model. Several clustering problems are shown to be easier on stable instances [BBG09, ABS10, DLS12],
and spectral techniques have been analyzed for the stable maximum cut problem [BL10, BDLS12]. See
Subsection 4.6 for more details.

Informally, the higher order Cheeger's inequality shows that an undirected graph has $k$ disjoint sparse cuts
if and only if $\lambda_k$ is small. This suggests that the graph has at most $k - 1$ outstanding sparse cuts when
$\lambda_{k-1}$ is small and $\lambda_k$ is large. The algebraic condition that $\lambda_2$ is small and $\lambda_3$ is large seems similar to the
stability condition but more adaptable to spectral analysis. This motivates us to analyze the performance
of the spectral partitioning algorithm through higher-order spectral gaps.

In practical instances of image segmentation, there are usually only a few outstanding objects in the image,
and so $\lambda_k$ is large for a small $k$ [Lux07]. Thus Theorem 1.2 provides a theoretical explanation of why the
spectral partitioning algorithm performs much better than the worst case bound by Cheeger's inequality
in those instances. In clustering applications, there is a well-known eigengap heuristic that partitions the
data into $k$ clusters if $\lambda_k$ is small and $\lambda_{k+1}$ is large [Lux07]. Corollary 1.3 shows that in such situations
the spectral algorithms in [LOT12] perform better than the worst case bound by the higher order Cheeger's
inequality.
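The eigengap heuristic mentioned above can be sketched in a few lines. This is a minimal illustration rather than the exact procedure of [Lux07]: it assumes the eigenvalues are already available and simply returns the $k$ that maximizes $\lambda_{k+1} - \lambda_k$. The function name and the `k_max` parameter are hypothetical.

```python
def eigengap_k(eigenvalues, k_max=None):
    """Return the k in 1..k_max maximizing lambda_{k+1} - lambda_k."""
    lam = sorted(eigenvalues)
    if k_max is None:
        k_max = len(lam) - 1
    # lam[k] is lambda_{k+1} and lam[k - 1] is lambda_k in 1-indexed notation.
    return max(range(1, k_max + 1), key=lambda k: lam[k] - lam[k - 1])

# Three small eigenvalues followed by a large gap suggest three clusters.
spectrum = [0.0, 0.01, 0.02, 0.9, 1.0, 1.1]
k = eigengap_k(spectrum)
```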
1.4 Other Graph Partitioning Problems



Our techniques can be used to improve the spectral algorithms for other graph partitioning problems using
higher order eigenvalues. In the minimum bisection problem, the objective is to find a set $S$ with minimum
conductance among the sets with $|V|/2$ vertices. While it is very nontrivial to find a sparse cut with exactly
$|V|/2$ vertices [FK02, Rac08], it is well known that a simple recursive spectral algorithm can find a balanced
separator $S$ with $\phi(S) = O(\sqrt{\epsilon})$ and $|S| = \Omega(|V|)$, where $\epsilon$ denotes the conductance of the minimum
bisection (e.g. [KVV04]). We use Theorem 1.2 to generalize the recursive spectral algorithm to obtain a
better approximation guarantee when $\lambda_k$ is large for a small $k$.

Theorem 1.4. Let

$$\epsilon := \min_{|S| = |V|/2} \phi(S).$$

There is a polynomial time algorithm that finds a set $S$ such that $|V|/5 \le |S| \le 4|V|/5$ and $\phi(S) \le O(k\epsilon/\lambda_k)$.

In the maximum cut problem, the objective is to find a partition of the vertices which maximizes the weight
of edges whose endpoints are on different sides of the partition. Goemans and Williamson [GW95] gave an
SDP-based 0.878-approximation algorithm for the maximum cut problem. Trevisan [Tre09] gave a spectral
algorithm with approximation ratio strictly better than 1/2. Both algorithms find a solution that cuts at
least $1 - O(\sqrt{\epsilon})$ fraction of edges when the optimal solution cuts at least $1 - O(\epsilon)$ fraction of edges. Using
a similar method as in the proof of Theorem 1.2, we generalize the spectral algorithm in [Tre09] for the
maximum cut problem to obtain a better approximation guarantee when $\lambda_{n-k}$ is small for a small $k$.

Theorem 1.5. There is a polynomial time algorithm that on input graph $G$ finds a cut $(S, \bar{S})$ such that if
the optimal solution cuts at least $1 - \epsilon$ fraction of the edges, then $(S, \bar{S})$ cuts at least



fraction of edges. 

1.5 More Related Work 

Approximating Graph Partitioning Problems: Besides spectral partitioning, there are approximation
algorithms for the sparsest cut problem based on linear and semidefinite programming relaxations. There is an
LP-based $O(\log n)$ approximation algorithm by Leighton and Rao [LR99], and an SDP-based $O(\sqrt{\log n})$
approximation algorithm by Arora, Rao and Vazirani [ARV04]. The subspace enumeration algorithm by Arora,
Barak and Steurer [ABS10] provides an $O(1/\lambda_k)$ approximation algorithm for the sparsest cut problem with
running time $n^{O(k)}$, by searching for a sparse cut in the $(k-1)$-dimensional eigenspace corresponding to
$\lambda_1, \dots, \lambda_{k-1}$. It is worth noting that for $k = 3$ the subspace enumeration algorithm is exactly the same as
the spectral partitioning algorithm. Nonetheless, the result in [ABS10] is incomparable to Theorem 1.2 since
it does not upper-bound $\phi(G)$ by a function of $\lambda_2$ and $\lambda_3$. Recently, using the Lasserre hierarchy for SDP
relaxations, Guruswami and Sinop [GS12] gave an $O(1/\lambda_k)$ approximation algorithm for the sparsest cut
problem with running time $n^{O(1)} 2^{O(k)}$. Moreover, the general framework of Guruswami and Sinop [GS12]
applies to other graph partitioning problems including minimum bisection and maximum cut, obtaining
approximation algorithms with similar performance guarantees and running times. This line of recent work
is closely related to ours in the sense that it shows that many graph partitioning problems are easier to
approximate on graphs with fast growing spectrums, i.e. $\lambda_k$ is large for a small $k$. Although their results
give much better approximation guarantees when $k$ is large, our results show that simple spectral algorithms
provide nontrivial performance guarantees.



Higher Eigenvalues of Special Graphs: Another direction to show that spectral algorithms work well is to
analyze their performance in special graph classes. Spielman and Teng [ST07] showed that $\lambda_2 = O(1/n)$ for a
bounded degree planar graph and a spectral algorithm can find a separator of size $O(\sqrt{n})$ in such graphs. This
result is extended to bounded genus graphs by Kelner [Kel06] and to fixed minor free graphs by Biswal, Lee
and Rao [BLR10]. This is further extended to higher eigenvalues by Kelner, Lee, Price and Teng [KLPT11]:
$\lambda_k = O(k/n)$ for planar graphs, bounded genus graphs, and fixed minor free graphs when the maximum
degree is bounded. Combining with a higher order Cheeger inequality for planar graphs [LOT12], this implies
that $\phi_k(G) = O(\sqrt{k/n})$ for bounded degree planar graphs. We note that these results give mathematical
bounds on the conductances of the resulting partitions, but they do not imply that the approximation
guarantee of Cheeger's inequality could be improved for these graphs, and neither does our result, as these
graphs have slowly growing spectrums.

Planted Random Instances, Semi-Random Instances, and Stable Instances: We have discussed some previous 
work on these topics, and we will discuss some relations to our results in Subsection 4.5 and Subsection 4.6. 

1.6 Proof Overview 

We start by describing an informal intuition of the proof of Theorem 1.2 for $k = 3$, and then we describe
how this intuition can be generalized. For a function $f \in \mathbb{R}^V$, let $\mathcal{R}(f) = f^T L f / (d \|f\|^2)$ be the Rayleigh
quotient of $f$ (see (2.2) of Subsection 2.1 for the definition in general graphs). Let $f$ be a function that is
orthogonal to the constant function and that satisfies $\mathcal{R}(f) \approx \lambda_2$.

Suppose $\lambda_2$ is small and $\lambda_3$ is large. Then the higher order Cheeger's inequality implies that there is a
partitioning of the graph into two sets of small conductance, but in every partitioning into at least three
sets, there is a set of large conductance. So, we expect the graph to have a sparse cut of which the two
parts are expanders; see [Tan12] for a quantitative statement. Since $\mathcal{R}(f)$ is small and $f$ is orthogonal to the
constant function, we expect that the vertices in the same expander have similar values in $f$ and the average
values of the two expanders are far apart. Hence, $f$ is similar to a step function with two steps representing a
cut, and we expect that $\mathcal{R}(f) \approx \phi(G)$ in this case. Therefore, roughly speaking, $\lambda_3 \gg \lambda_2$ implies $\lambda_2 \approx \phi(G)$.

Conversely, Theorem 1.2 shows that if $\lambda_2 \approx \phi^2(G)$ then $\lambda_3 \approx \lambda_2$. One way to prove that $\lambda_2 \approx \lambda_3$ is to find a
function $f'$ of Rayleigh quotient close to $\lambda_2$ such that $f'$ is orthogonal to both $f$ and the constant function.
For example, if $G$ is a cycle, then $\lambda_2 = \Theta(1/n^2)$, $\phi(G) = \Theta(1/n)$, and $f$ (up to normalizing factors) could
represent the cosine function. In this case we may define $f'$ to be the sine function. Unfortunately, finding
such a function $f'$ in general is not as straightforward. Instead, our idea is to find three disjointly supported
functions $f_1, f_2, f_3$ of Rayleigh quotient close to $\lambda_2$. As we prove in Lemma 2.3, this would upper-bound $\lambda_3$
by $2\max\{\mathcal{R}(f_1), \mathcal{R}(f_2), \mathcal{R}(f_3)\}$. For the cycle example, if $f$ is the cosine function, we may construct $f_1, f_2, f_3$
simply by first dividing the support of $f$ into three disjoint intervals and then constructing each $f_i$ by defining
a smooth localization of $f$ in one of those intervals. To ensure that $\max\{\mathcal{R}(f_1), \mathcal{R}(f_2), \mathcal{R}(f_3)\} \approx \lambda_2$ we need
to show that $f$ is a "smooth" function, whose values change continuously. We make this rigorous by showing
that if $\lambda_2 \approx \phi(G)^2$, then the function $f$ must be smooth. Therefore, we can construct three disjointly
supported functions based on $f$ and show that $\lambda_2 \approx \lambda_3$.

We provide two proofs of Theorem 1.2. The first proof generalizes the first observation. We show that if
$\lambda_k \gg k\lambda_2$, then $\phi(G) \lesssim k\lambda_2$. The main idea is to show that if $\lambda_k \gg k\lambda_2$, then $f$ can be approximated by a
$k$-step function $g$ in the sense that $\|f - g\| \approx 0$ (in general we show that any function $f$ can be approximated
by a $k$-step function $g$ such that $\|f - g\|^2 \le \mathcal{R}(f)/\lambda_k$). It is instructive to prove that if $f$ is exactly a
$k$-step function then $\phi(G) \le O(k\mathcal{R}(f))$. Our main technical step, Proposition 3.2, provides a robust version
of the latter fact by showing that for any $k$-step approximation $g$ of $f$, $\phi(f) \le O(k(\mathcal{R}(f) + \|f - g\|\sqrt{\mathcal{R}(f)}))$.

On the other hand, our second proof generalizes the second observation. Say $\mathcal{R}(f) \approx \phi(G)^2$. We partition
the support of $f$ into disjoint intervals of the form $[2^{-i}, 2^{-(i+1)}]$, and we show that the vertices are distributed
almost uniformly in most of these intervals in the sense that if we divide $[2^{-i}, 2^{-(i+1)}]$ into $k$ equal length
sub-intervals, then we expect to see the same amount of mass in the subintervals. This shows that $f$ is a
smooth function. We then argue that $\lambda_k \lesssim k^2 \lambda_2$, by constructing $k$ disjointly supported functions each of
Rayleigh quotient $O(k^2)\mathcal{R}(f)$.



2 Preliminaries 

Let $G = (V, E, w)$ be a finite, undirected graph, with positive weights $w : E \to (0, \infty)$ on the edges. For
a pair of vertices $u, v \in V$, we write $w(u, v)$ for $w(\{u, v\})$. For a subset of vertices $S \subseteq V$, we write
$E(S) := \{\{u, v\} \in E : u, v \in S\}$. For disjoint sets $S, T \subseteq V$, we write $E(S, T) := \{\{u, v\} \in E : u \in S, v \in T\}$.
For a subset of edges $F \subseteq E$, we write $w(F) = \sum_{e \in F} w(e)$. We use $u \sim v$ to denote $\{u, v\} \in E$. We extend
the weight to vertices by defining, for a single vertex $v \in V$, $w(v) := \sum_{u \sim v} w(u, v)$. We can think of $w(v)$ as
the weighted degree of vertex $v$. For the sake of clarity we will assume throughout that $w(v) \ge 1$ for every
$v \in V$. For $S \subseteq V$, we write $\mathrm{vol}(S) = \sum_{v \in S} w(v)$ to denote the volume of $S$.

Given a subset $S \subseteq V$, we denote the Dirichlet conductance of $S$ by

$$\phi(S) := \frac{w(E(S, \bar{S}))}{\min\{\mathrm{vol}(S), \mathrm{vol}(\bar{S})\}}.$$

For a function $f \in \mathbb{R}^V$ and a threshold $t \in \mathbb{R}$, let $V_f(t) := \{v : f(v) \ge t\}$ be a threshold set of $f$. We let

$$\phi(f) := \min_{t \in \mathbb{R}} \phi(V_f(t))$$

be the conductance of the best threshold set of the function $f$, and $V_f(t_{\mathrm{opt}})$ be the smaller side (in volume)
of that minimum cut.

For any two thresholds $t_1, t_2 \in \mathbb{R}$, we use

$$[t_1, t_2] := \{x \in \mathbb{R} : \min\{t_1, t_2\} < x \le \max\{t_1, t_2\}\}.$$

Note that all intervals are defined to be closed on the larger value and open on the smaller value. For an
interval $I = [t_1, t_2] \subseteq \mathbb{R}$, we use $\mathrm{len}(I) := |t_1 - t_2|$ to denote the length of $I$. For a function $f \in \mathbb{R}^V$, we
define $V_f(I) := \{v : f(v) \in I\}$ to denote the vertices within $I$. The volume of an interval $I$ is defined as
$\mathrm{vol}_f(I) := \mathrm{vol}(V_f(I))$. We also abuse the notation and use $\mathrm{vol}_f(t) := \mathrm{vol}(V_f(t))$ to denote the volume of the
interval $[t, \infty]$. We define the support of $f$, $\mathrm{supp}(f) := \{v : f(v) \ne 0\}$, as the set of vertices with nonzero
values in $f$. We say two functions $f, g \in \mathbb{R}^V$ are disjointly supported if $\mathrm{supp}(f) \cap \mathrm{supp}(g) = \emptyset$.

For any $t_1, t_2, \dots, t_l \in \mathbb{R}$, let $\psi_{t_1, \dots, t_l} : \mathbb{R} \to \mathbb{R}$ be defined as

$$\psi_{t_1, \dots, t_l}(x) = \operatorname{argmin}_{t_i} |x - t_i|.$$

In words, for any $x \in \mathbb{R}$, $\psi_{t_1, \dots, t_l}(x)$ is the value of $t_i$ closest to $x$.

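For $\rho > 0$, we say a function $g$ is $\rho$-Lipschitz w.r.t. $f$, if for all pairs of vertices $u, v \in V$,

$$|g(u) - g(v)| \le \rho\, |f(u) - f(v)|.$$

The next inequality follows from the Cauchy-Schwarz inequality and will be useful in our proof. Let
$a_1, \dots, a_m, b_1, \dots, b_m > 0$. Then,

$$\sum_{i=1}^{m} \frac{a_i^2}{b_i} \ge \frac{\left(\sum_{i=1}^{m} a_i\right)^2}{\sum_{i=1}^{m} b_i}. \tag{2.1}$$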
2.1 Spectral Theory of the Weighted Laplacian

We write $\ell^2(V, w)$ for the Hilbert space of functions $f : V \to \mathbb{R}$ with inner product

$$\langle f, g \rangle_w := \sum_{v \in V} w(v) f(v) g(v),$$

and norm $\|f\|_w^2 = \langle f, f \rangle_w$. We reserve $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ for the standard inner product and norm on $\mathbb{R}^k$, $k \in \mathbb{N}$,
and $\ell^2(V)$.

We consider some operators on $\ell^2(V, w)$. The adjacency operator is defined by $Af(v) = \sum_{u \sim v} w(u, v) f(u)$,
and the diagonal degree operator by $Df(v) = w(v) f(v)$. Then the combinatorial Laplacian is defined by
$L = D - A$, and the normalized Laplacian is given by

$$\mathcal{L}_G := I - D^{-1/2} A D^{-1/2}.$$

Observe that for a $d$-regular unweighted graph, we have $\mathcal{L}_G = \frac{1}{d} L$.

If $g : V \to \mathbb{R}$ is a non-zero function and $f = D^{-1/2} g$, then

$$\frac{\langle g, \mathcal{L}_G\, g \rangle}{\langle g, g \rangle} = \frac{\langle g, D^{-1/2} L D^{-1/2} g \rangle}{\langle g, g \rangle} = \frac{\langle f, L f \rangle}{\langle D^{1/2} f, D^{1/2} f \rangle} = \frac{\sum_{u \sim v} w(u, v) |f(u) - f(v)|^2}{\|f\|_w^2} =: \mathcal{R}_G(f), \tag{2.2}$$

where the latter value is referred to as the Rayleigh quotient of $f$ (with respect to $G$). We drop the subscript
of $\mathcal{R}_G(f)$ when the graph is clear from the context.

In particular, $\mathcal{L}_G$ is a positive semidefinite operator with eigenvalues

$$0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2.$$

For a connected graph, the first eigenvalue corresponds to the eigenfunction $g = D^{1/2} f$, where $f$ is any
non-zero constant function. Furthermore, by standard variational principles,

$$\lambda_k = \min_{g_1, \dots, g_k \in \ell^2(V)} \max_{g \ne 0} \left\{ \frac{\langle g, \mathcal{L}_G\, g \rangle}{\langle g, g \rangle} : g \in \mathrm{span}\{g_1, \dots, g_k\} \right\} = \min_{f_1, \dots, f_k \in \ell^2(V, w)} \max_{f \ne 0} \left\{ \mathcal{R}(f) : f \in \mathrm{span}\{f_1, \dots, f_k\} \right\}, \tag{2.3}$$

where both minimums are over sets of $k$ non-zero orthogonal functions in the Hilbert spaces $\ell^2(V)$ and
$\ell^2(V, w)$, respectively. We refer to [Chu97] for more background on the spectral theory of the normalized
Laplacian. The following proposition is proved in [HLW06] and will be useful in our proof.

Proposition 2.1 (Hoory, Linial and Wigderson [HLW06]). There are two disjointly supported functions
$f_+, f_- \in \ell^2(V, w)$ such that $f_+ \ge 0$ and $f_- \le 0$ and $\mathcal{R}(f_+) \le \lambda_2$ and $\mathcal{R}(f_-) \le \lambda_2$.

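Proof. Let $g \in \ell^2(V)$ be the second eigenfunction of $\mathcal{L}$. Let $g_+ \in \ell^2(V)$ be the function with
$g_+(u) = \max\{g(u), 0\}$ and $g_- \in \ell^2(V)$ be the function with $g_-(u) = \min\{g(u), 0\}$. Then, for any vertex $u \in \mathrm{supp}(g_+)$,

$$(\mathcal{L} g_+)(u) = g_+(u) - \sum_{v : v \sim u} \frac{w(u, v)\, g_+(v)}{\sqrt{w(u) w(v)}} \le g(u) - \sum_{v : v \sim u} \frac{w(u, v)\, g(v)}{\sqrt{w(u) w(v)}} = (\mathcal{L} g)(u) = \lambda_2 \cdot g(u).$$

Therefore,

$$\langle g_+, \mathcal{L} g_+ \rangle = \sum_{u \in \mathrm{supp}(g_+)} g_+(u) \cdot (\mathcal{L} g_+)(u) \le \sum_{u \in \mathrm{supp}(g_+)} \lambda_2 \cdot g_+(u)^2 = \lambda_2 \cdot \|g_+\|^2.$$

Letting $f_+ = D^{-1/2} g_+$, we get

$$\lambda_2 \ge \frac{\langle g_+, \mathcal{L} g_+ \rangle}{\|g_+\|^2} = \frac{\langle f_+, L f_+ \rangle}{\|f_+\|_w^2} = \mathcal{R}(f_+).$$

Similarly, we can define $f_- = D^{-1/2} g_-$, and show that $\mathcal{R}(f_-) \le \lambda_2$. □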

By choosing either of $f_+$ or $f_-$ that has a smaller (in volume) support, and taking a proper normalization,
we get the following corollary.

Corollary 2.2. There exists a function $f \in \ell^2(V, w)$ such that $f \ge 0$, $\mathcal{R}(f) \le \lambda_2$, $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$, and
$\|f\|_w = 1$.

Instead of directly upper bounding $\lambda_k$ in the proof of Theorem 1.2, we will construct $k$ disjointly supported
functions with small Rayleigh quotients. In the next lemma we show that by the variational principle this
gives an upper-bound on $\lambda_k$.

Lemma 2.3. For any $k$ disjointly supported functions $f_1, f_2, \dots, f_k \in \ell^2(V, w)$, we have

$$\lambda_k \le 2 \max_{1 \le i \le k} \mathcal{R}(f_i).$$

Proof. By equation (2.3), it is sufficient to show that for any function $h \in \mathrm{span}\{f_1, \dots, f_k\}$,
$\mathcal{R}(h) \le 2\max_i \mathcal{R}(f_i)$. Note that $\mathcal{R}(f_i) = \mathcal{R}(c f_i)$ for any constant $c \ne 0$, so we can assume $h := \sum_{i=1}^{k} f_i$.
Since $f_1, \dots, f_k$ are disjointly supported, for any $u, v \in V$, we have

$$|h(u) - h(v)|^2 \le 2 \sum_{i=1}^{k} |f_i(u) - f_i(v)|^2.$$

Therefore,

$$\mathcal{R}(h) = \frac{\sum_{u \sim v} w(u, v) |h(u) - h(v)|^2}{\|h\|_w^2} \le \frac{2 \sum_{u \sim v} \sum_{i=1}^{k} w(u, v) |f_i(u) - f_i(v)|^2}{\|h\|_w^2} = \frac{2 \sum_{i=1}^{k} \sum_{u \sim v} w(u, v) |f_i(u) - f_i(v)|^2}{\sum_{i=1}^{k} \|f_i\|_w^2} \le 2 \max_{1 \le i \le k} \mathcal{R}(f_i).$$

□
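Lemma 2.3 can be checked numerically on a small example. The sketch below builds three disjointly supported tent-shaped functions on the 30-cycle, evaluates their Rayleigh quotients with the weighted definition (2.2), and compares against $\lambda_3$. The closed form $\lambda_3 = 1 - \cos(2\pi/n)$ for the cycle is an assumption brought in from outside the text, and the helper names are illustrative.

```python
import math

def rayleigh_quotient(adj, f):
    """R(f) = sum_{u~v} w(u,v) (f(u)-f(v))^2 / sum_v w(v) f(v)^2, for a weighted
    graph given as adj[u][v] = w(u, v) (symmetric, integer-labelled vertices)."""
    energy = sum(w * (f[u] - f[v]) ** 2
                 for u in adj for v, w in adj[u].items() if u < v)  # each edge once
    norm = sum(sum(adj[v].values()) * f[v] ** 2 for v in adj)       # w(v) = weighted degree
    return energy / norm

def tent(n, start, width):
    """A tent-shaped function on the n-cycle supported on `width` consecutive vertices."""
    f = {v: 0.0 for v in range(n)}
    for j in range(width):
        f[(start + j) % n] = float(min(j + 1, width - j))
    return f

n = 30
cycle = {v: {(v - 1) % n: 1.0, (v + 1) % n: 1.0} for v in range(n)}
fs = [tent(n, start, 10) for start in (0, 10, 20)]   # three disjointly supported functions
quotients = [rayleigh_quotient(cycle, f) for f in fs]
lambda_3 = 1.0 - math.cos(2.0 * math.pi / n)         # assumed closed form for the cycle
print(lambda_3, 2 * max(quotients))
```

Here each tent has energy 10 and weighted squared norm 220, so every quotient equals $1/22$, and the bound $\lambda_3 \le 2/22$ holds with room to spare.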



2.2 Cheeger's Inequality with Dirichlet Boundary Conditions 

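Many variants of the following lemma are known; see, e.g. [Chu96].

Lemma 2.4. For every non-negative $h \in \ell^2(V, w)$ such that $\mathrm{vol}(\mathrm{supp}(h)) \le \mathrm{vol}(V)/2$, the following holds:

$$\phi(h) \le \frac{\sum_{u \sim v} w(u, v) |h(u) - h(v)|}{\sum_v w(v) h(v)}.$$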



Proof. Since the right hand side is homogeneous in $h$, we may assume that $\max_v h(v) \le 1$. Let $0 < t \le 1$ be
chosen uniformly at random. Then, by linearity of expectation,

$$\frac{\mathbb{E}\left[w(E(V_h(t), \overline{V_h(t)}))\right]}{\mathbb{E}\left[\mathrm{vol}(V_h(t))\right]} = \frac{\sum_{u \sim v} w(u, v) |h(u) - h(v)|}{\sum_v w(v) h(v)}.$$

This implies that there exists a $0 < t \le 1$ such that

$$\phi(V_h(t)) \le \frac{\sum_{u \sim v} w(u, v) |h(u) - h(v)|}{\sum_v w(v) h(v)}.$$

The latter holds since for any $t > 0$, $\mathrm{vol}(V_h(t)) \le \mathrm{vol}(V)/2$. □
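The two expectations in this proof can be computed term by term. As a short worked step (a sketch using only the definitions above, with $t$ uniform on $(0, 1]$ and $0 \le h \le 1$): an edge $\{u, v\}$ is cut by $V_h(t)$ exactly when $t$ lies between $h(u)$ and $h(v)$, and a vertex $v$ belongs to $V_h(t)$ exactly when $t \le h(v)$.

```latex
\begin{align*}
\mathbb{E}\big[w(E(V_h(t), \overline{V_h(t)}))\big]
  &= \sum_{u \sim v} w(u,v)\, \Pr\big[\min\{h(u),h(v)\} < t \le \max\{h(u),h(v)\}\big]
   = \sum_{u \sim v} w(u,v)\, |h(u) - h(v)|, \\
\mathbb{E}\big[\mathrm{vol}(V_h(t))\big]
  &= \sum_{v} w(v)\, \Pr[t \le h(v)]
   = \sum_{v} w(v)\, h(v).
\end{align*}
```

Since the ratio of the two expectations is at least the smallest ratio attained by an individual threshold, some $t$ has cut weight over volume no larger than this quotient.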

2.3 Energy Lower Bound 

We define the energy of a function $f \in \ell^2(V, w)$ as

$$\mathcal{E}_f := \sum_{u \sim v} w(u, v) |f(u) - f(v)|^2.$$

Observe that $\mathcal{R}(f) = \mathcal{E}_f / \|f\|_w^2$. We also define the energy of $f$ restricted to an interval $I$ as follows:

$$\mathcal{E}_f(I) := \sum_{u \sim v} w(u, v)\, \mathrm{len}(I \cap [f(u), f(v)])^2.$$

When the function $f$ is clear from the context we drop the subscripts from the above definitions.

The next fact shows that by restricting the energy of $f$ to disjoint intervals we may only decrease the energy.

Fact 2.5. For any set of disjoint intervals $I_1, \dots, I_m$, we have

$$\mathcal{E}_f \ge \sum_{i=1}^{m} \mathcal{E}_f(I_i).$$

Proof.

$$\mathcal{E}_f = \sum_{u \sim v} w(u, v) |f(u) - f(v)|^2 \ge \sum_{u \sim v} \sum_{i=1}^{m} w(u, v)\, \mathrm{len}(I_i \cap [f(u), f(v)])^2 = \sum_{i=1}^{m} \mathcal{E}_f(I_i).$$

□

The following is the key lemma to lower bound the energy of a function /. It shows that a long interval with 
small volume must have a significant contribution to the energy of /. 

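Lemma 2.6. For any non-negative function $f \in \ell^2(V, w)$ with $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$, for any interval
$I = [a, b]$ with $a > b \ge 0$, we have

$$\mathcal{E}(I) \ge \frac{\phi^2(f) \cdot \mathrm{vol}_f^2(a) \cdot \mathrm{len}^2(I)}{\phi(f) \cdot \mathrm{vol}_f(a) + \mathrm{vol}_f(I)}.$$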

Proof. Since $f$ is non-negative with $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$, by the definition of $\phi(f)$, the total weight of
the edges going out of the threshold set $V_f(t)$ is at least $\phi(f) \cdot \mathrm{vol}_f(a)$, for any $a \ge t > b \ge 0$. Therefore, by
summing over these threshold sets, we have

$$\sum_{u \sim v} w(u, v)\, \mathrm{len}(I \cap [f(u), f(v)]) \ge \mathrm{len}(I) \cdot \phi(f) \cdot \mathrm{vol}_f(a).$$

Let $E' := \{\{u, v\} : \mathrm{len}(I \cap [f(u), f(v)]) > 0\}$ be the set of edges with nonempty intersection with the interval
$I$. Let $\beta \in (0, 1)$ be a parameter to be fixed later. Let $F \subseteq E'$ be the set of edges of $E'$ that are not adjacent
to any of the vertices in $I$. If $w(F) \ge \beta w(E')$, then

$$\mathcal{E}(I) \ge w(F) \cdot \mathrm{len}(I)^2 \ge \beta \cdot w(E') \cdot \mathrm{len}(I)^2 \ge \beta \cdot \phi(f) \cdot \mathrm{vol}_f(a) \cdot \mathrm{len}(I)^2.$$

Otherwise, $\mathrm{vol}_f(I) \ge (1 - \beta) w(E')$. Therefore, by the Cauchy-Schwarz inequality (2.1), we have

$$\mathcal{E}(I) = \sum_{\{u, v\} \in E'} w(u, v) \left(\mathrm{len}(I \cap [f(u), f(v)])\right)^2 \ge \frac{\left(\sum_{\{u, v\} \in E'} w(u, v)\, \mathrm{len}(I \cap [f(u), f(v)])\right)^2}{w(E')} \ge \frac{(1 - \beta)\, \mathrm{len}(I)^2 \cdot \phi(f)^2 \cdot \mathrm{vol}_f(a)^2}{\mathrm{vol}_f(I)}.$$

Choosing $\beta = (\phi(f) \cdot \mathrm{vol}_f(a)) / (\phi(f) \cdot \mathrm{vol}_f(a) + \mathrm{vol}_f(I))$ such that the above two terms are equal gives the
lemma. □
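To see why this choice of $\beta$ works, write $A := \phi(f) \cdot \mathrm{vol}_f(a)$ and $B := \mathrm{vol}_f(I)$ (shorthand introduced only for this worked step, assuming $A > 0$). The two cases of the proof give the lower bounds $\beta A\, \mathrm{len}(I)^2$ and $(1 - \beta) A^2\, \mathrm{len}(I)^2 / B$, and equating them yields:

```latex
\begin{align*}
\beta A = \frac{(1-\beta) A^2}{B}
  \;\Longleftrightarrow\; \beta B = (1-\beta) A
  \;\Longleftrightarrow\; \beta = \frac{A}{A+B},
\qquad\text{so}\qquad
\mathcal{E}(I) \ge \beta A \,\mathrm{len}(I)^2 = \frac{A^2\, \mathrm{len}(I)^2}{A+B}
  = \frac{\phi^2(f)\, \mathrm{vol}_f^2(a)\, \mathrm{len}^2(I)}{\phi(f)\, \mathrm{vol}_f(a) + \mathrm{vol}_f(I)}.
\end{align*}
```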



We note that Lemma 2.6 can be used to give a new proof of Cheeger's inequality with a weaker constant; 
see Appendix A. 



3 Analysis of Spectral Partitioning 

Throughout this section we assume that $f \in \ell^2(V, w)$ is a non-negative function of norm $\|f\|_w = 1$ such
that $\mathcal{R}(f) \le \lambda_2$ and $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$. The existence of this function follows from Corollary 2.2. In
Subsection 3.1, we give our first proof of Theorem 1.2 which is based on the idea of approximating $f$ by a
$(2k+1)$-step function $g$. Our second proof is given in Subsection 3.2.



3.1 First Proof 



We say a function $g \in \ell^2(V, w)$ is an $l$-step approximation of $f$, if there exist $l$ thresholds
$0 = t_0 < t_1 < \dots < t_{l-1}$ such that for every vertex $v$,

$$g(v) = \psi_{t_0, t_1, \dots, t_{l-1}}(f(v)).$$

In words, $g(v) = t_i$ if $t_i$ is the closest threshold to $f(v)$; see Figure 3.1 for an example.
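As a small illustration of this definition, the sketch below computes a step approximation by snapping each value $f(v)$ to its nearest threshold, and evaluates $\|f - g\|_w^2$. The function names and the dictionary representation of $f$ and $w$ are illustrative choices; ties between two equally close thresholds are broken towards the smaller one, which is an arbitrary assumption.

```python
def nearest_threshold(x, thresholds):
    """psi_{t_0,...,t_{l-1}}(x): the threshold closest to x (ties go to the smaller one)."""
    return min(sorted(thresholds), key=lambda t: abs(x - t))

def step_approximation(f, thresholds):
    """Return g with g(v) = nearest threshold to f(v); f maps vertices to floats."""
    return {v: nearest_threshold(x, thresholds) for v, x in f.items()}

def weighted_sq_distance(f, g, w):
    """||f - g||_w^2 = sum_v w(v) (f(v) - g(v))^2."""
    return sum(w[v] * (f[v] - g[v]) ** 2 for v in f)

f = {"a": 0.1, "b": 0.3, "c": 0.9}
w = {"a": 1.0, "b": 1.0, "c": 1.0}
g = step_approximation(f, [0.0, 0.5, 1.0])   # {"a": 0.0, "b": 0.5, "c": 1.0}
```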



Figure 3.1: The crosses denote the values of function $f$, and the circles denote the values of function $g$.

We show that if there is a large gap between $\lambda_2$ and $\lambda_k$, then the function $f$ is well approximated by a step
function $g$ with at most $2k + 1$ steps. Then we define an appropriate $h$ and apply Lemma 2.4 to get a lower
bound on the energy of $f$ in terms of $\|f - g\|_w$. One can think of $h$ as a probability distribution function on
the threshold sets, and we will define $h$ in such a way that the threshold sets that are further away from the
thresholds $t_0, t_1, \dots, t_{2k}$ have higher probability.

Approximating $f$ by a $2k + 1$ Step Function

In the next lemma we show that if there is a large gap between $\mathcal{R}(f)$ and $\lambda_k$, then there is a $(2k+1)$-step
function $g$ such that $\|f - g\|_w^2 = O(\mathcal{R}(f)/\lambda_k)$.

Lemma 3.1. There exists a $(2k+1)$-step approximation of $f$, call it $g$, such that

$$\|f - g\|_w^2 \le \frac{4\mathcal{R}(f)}{\lambda_k}. \tag{3.1}$$

Proof. Let $M := \max_v f(v)$. We will find $2k + 1$ thresholds $0 =: t_0 \le t_1 \le \dots \le t_{2k} = M$, then we let $g$ be
a $(2k+1)$-step approximation of $f$ with these thresholds. Let $C := 2\mathcal{R}(f)/(k\lambda_k)$. We choose these thresholds
inductively. Given $t_0, t_1, \dots, t_{i-1}$, we let $t_{i-1} \le t_i \le M$ be the smallest number such that

$$\sum_{v : t_{i-1} \le f(v) \le t_i} w(v)\, |f(v) - \psi_{t_{i-1}, t_i}(f(v))|^2 = C. \tag{3.2}$$

Observe that the left hand side varies continuously with $t_i$: when $t_i = t_{i-1}$ the left hand side is zero, and
for larger $t_i$ it is non-decreasing. If we can satisfy (3.2) for some $t_{i-1} \le t_i \le M$, then we let $t_i$ be the
smallest such number, and otherwise we set $t_i = M$.

We say the procedure succeeds if $t_{2k} = M$. We will show that: (i) if the procedure succeeds then the lemma
follows, and (ii) that the procedure always succeeds. Part (i) is clear because if we define $g$ to be the
$(2k+1)$-step approximation of $f$ with respect to $t_0, \dots, t_{2k}$, then

$$\|f - g\|_w^2 = \sum_{i=1}^{2k} \sum_{v : t_{i-1} \le f(v) \le t_i} w(v)\, |f(v) - \psi_{t_{i-1}, t_i}(f(v))|^2 \le 2kC = \frac{4\mathcal{R}(f)}{\lambda_k},$$

and we are done. The inequality in the above equation follows by (3.2).

Suppose to the contrary that the procedure does not succeed. We will construct $2k$ disjointly supported
functions of Rayleigh quotients less than $\lambda_k/2$, and then use Lemma 2.3 to get a contradiction. For $1 \le i \le 2k$,
let $f_i$ be the following function (see Figure 3.2 for an illustration):

$$f_i(v) := \begin{cases} |f(v) - \psi_{t_{i-1}, t_i}(f(v))| & \text{if } t_{i-1} \le f(v) \le t_i, \\ 0 & \text{otherwise.} \end{cases}$$

We will argue that at least $k$ of these functions have $\mathcal{R}(f_i) < \frac{1}{2}\lambda_k$. By (3.2), we already know that the
denominators of $\mathcal{R}(f_i)$ are equal to $C$ ($\|f_i\|_w^2 = C$), so it remains to find an upper bound for the numerators.
For any pair of vertices $u, v$, we show that

$$\sum_{i=1}^{2k} |f_i(u) - f_i(v)|^2 \le |f(u) - f(v)|^2. \tag{3.3}$$

The inequality follows using the fact that $f_1, \dots, f_{2k}$ are disjointly supported, and thus $u, v$ are contained in
the support of at most two of these functions. If both $u$ and $v$ are in the support of only one function, then
(3.3) holds since each $f_i$ is 1-Lipschitz w.r.t. $f$. Otherwise, say $u \in \mathrm{supp}(f_i)$ and $v \in \mathrm{supp}(f_j)$ for $i < j$,
then (3.3) holds since

$$|f_i(u) - f_i(v)|^2 + |f_j(u) - f_j(v)|^2 = |f(u) - g(u)|^2 + |f(v) - g(v)|^2 \le |f(u) - t_i|^2 + |f(v) - t_i|^2 \le |f(u) - f(v)|^2.$$
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10 20 30 40 50 



U rr- 1 ! : 1 ■ [ . ■ . ■ . --"7— ' 



1 X X X V^v" T^^- 1 : 

20 30 40 50 60 



o L^-^ — 



x I - 



20 30 40 50 



Figure 3.2: The figure on the left is the function / with ||/|| u , = 1. We cut / into three disjointly supported 
vectors f\,f%, fz by setting t = 0, t\ « 0.07, t 2 ~ 0.175, and t 3 — max/(u). For each 1 < i < 3, wc define 
fi(v) = mm{|/(u) - *<_x|, |/(«) - if U-i < f(v) < U, and zero otherwise. 



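Summing (3.3) we have

\[ \sum_{i=1}^{2k} \mathcal{R}(f_i) = \frac{1}{C} \sum_{i=1}^{2k} \sum_{u \sim v} w(u,v)\,|f_i(u) - f_i(v)|^2 \le \frac{1}{C} \sum_{u \sim v} w(u,v)\,|f(u) - f(v)|^2 = \frac{\mathcal{R}(f)}{C} = \frac{k\lambda_k}{2}. \]

Hence, by an averaging argument, there are $k$ disjointly supported functions $f'_1, \ldots, f'_k$ of Rayleigh quotients less than
$\lambda_k/2$, a contradiction to Lemma 2.3. □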
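Inequality (3.3) is easy to test numerically. The sketch below builds the pieces $f_i$ as functions of the value $x = f(v)$ and checks (3.3) on every pair of sample values. The helper names and the sample thresholds are assumptions made for this illustration, and a passing check on samples is a sanity test, not a proof.

```python
from itertools import combinations


def piece(x, lo, hi):
    """f_i as a function of x = f(v): distance to the nearer endpoint of [lo, hi], else 0."""
    return min(x - lo, hi - x) if lo <= x <= hi else 0.0


def pieces(x, thresholds):
    """Values (f_1, ..., f_m) at x for consecutive threshold pairs."""
    return [piece(x, lo, hi) for lo, hi in zip(thresholds, thresholds[1:])]


def check_33(values, thresholds, tol=1e-12):
    """Check sum_i |f_i(u) - f_i(v)|^2 <= |f(u) - f(v)|^2 on all pairs of values."""
    for a, b in combinations(values, 2):
        pa, pb = pieces(a, thresholds), pieces(b, thresholds)
        lhs = sum((p - q) ** 2 for p, q in zip(pa, pb))
        if lhs > (a - b) ** 2 + tol:
            return False
    return True


values = [i / 40 for i in range(41)]          # hypothetical values of f in [0, 1]
thresholds = [0.0, 0.07, 0.175, 0.4, 1.0]     # hypothetical thresholds t_0, ..., t_4
ok = check_33(values, thresholds)
```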



Upper Bounding $\phi(f)$ Using $2k+1$ Step Approximation $g$

Next, we show that we can use any function $g$ that is a $2k+1$ approximation of $f$ to upper-bound $\phi(f)$ in
terms of $\|f - g\|_w$.

Proposition 3.2. For any $2k+1$-step approximation of $f$ with $\|f\|_w = 1$, called $g$,

\[ \phi(f) \le 4k\,\mathcal{R}(f) + 4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)}. \]

Let $g$ be a $2k+1$ approximation of $f$ with thresholds $0 = t_0 \le t_1 \le \ldots \le t_{2k}$, i.e. $g(v) := \psi_{t_0,t_1,\ldots,t_{2k}}(f(v))$.
We will define a function $h \in \ell^2(V,w)$ such that each threshold set of $h$ is also a threshold set of $f$ (in
particular $\mathrm{supp}(h) = \mathrm{supp}(f)$), and

\[ \frac{\sum_{u \sim v} w(u,v)\,|h(v) - h(u)|}{\sum_v w(v)\,h(v)} \le 4k\,\mathcal{R}(f) + 4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)}. \tag{3.4} \]

We then simply use Lemma 2.4 to prove Proposition 3.2.
Let $\mu : \mathbb{R} \to \mathbb{R}$,

\[ \mu(x) := |x - \psi_{t_0,t_1,\ldots,t_{2k}}(x)|. \]

Note that $|f(v) - g(v)| = \mu(f(v))$. One can think of $\mu$ as a probability density function to sample the
threshold sets, where threshold sets that are further away from the thresholds $t_0, t_1, \ldots, t_{2k}$ are given higher
probability. We define $h$ as follows:

\[ h(v) := \int_0^{f(v)} \mu(x)\,dx. \]

Observe that the threshold sets of $h$ and the threshold sets of $f$ are the same, as $h(u) \ge h(v)$ if and only if
$f(u) \ge f(v)$. It remains to prove (3.4). We use the following two claims, that bound the denominator and
the numerator separately.

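Claim 3.3. For every vertex $v$,

\[ h(v) \ge \frac{1}{8k} f^2(v). \]

Proof. If $f(v) = 0$, then $h(v) = 0$ and there is nothing to prove. Suppose $f(v)$ is in the interval $f(v) \in
[t_i, t_{i+1}]$. Using the Cauchy-Schwarz inequality,

\[ f^2(v) = \Big( \sum_{j=0}^{i-1} (t_{j+1} - t_j) + (f(v) - t_i) \Big)^2 \le 2k \cdot \Big( \sum_{j=0}^{i-1} (t_{j+1} - t_j)^2 + (f(v) - t_i)^2 \Big). \]

On the other hand, by the definition of $h$,

\[ h(v) = \sum_{j=0}^{i-1} \int_{t_j}^{t_{j+1}} \mu(x)\,dx + \int_{t_i}^{f(v)} \mu(x)\,dx \ge \sum_{j=0}^{i-1} \frac{1}{4}(t_{j+1} - t_j)^2 + \frac{1}{4}(f(v) - t_i)^2, \]

where the inequality follows by the fact that $f(v) \in [t_i, t_{i+1}]$. □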
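The integral defining $h$ has a simple closed form, since $\mu$ is a tent function on each interval between consecutive thresholds. The sketch below evaluates $h$ as a function of $y = f(v)$ and checks the bound of Claim 3.3 on sample values. The function name `h_value` and the sample thresholds are assumptions for illustration; the code assumes sorted thresholds starting at 0 and $0 \le y \le t_{2k}$.

```python
def h_value(y, thresholds):
    """Integral of mu(x) = dist(x, nearest threshold) over [0, y].

    Assumes thresholds are sorted, thresholds[0] == 0 and 0 <= y <= thresholds[-1].
    """
    total = 0.0
    for lo, hi in zip(thresholds, thresholds[1:]):
        if y <= lo:
            break
        length = hi - lo
        d = min(y, hi) - lo            # portion of [lo, hi] covered by [0, y]
        if d <= length / 2:
            total += d * d / 2         # rising half of the tent
        else:
            total += length * length / 4 - (length - d) ** 2 / 2
    return total


k = 2
thresholds = [0.0, 0.1, 0.3, 0.6, 1.0]     # 2k + 1 = 5 hypothetical thresholds
values = [i / 100 for i in range(101)]
claim_holds = all(h_value(y, thresholds) >= y * y / (8 * k) - 1e-12 for y in values)
```

For equally spaced thresholds the bound is attained at the top threshold, for example $h = 1/8$ at $y = 1$ with thresholds $0, 1/2, 1$ and $k = 1$.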

And we will bound the numerator with the following claim.
Claim 3.4. For any pair of vertices $u, v \in V$,

\[ |h(v) - h(u)| \le \frac{1}{2}|f(v) - f(u)| \cdot (|f(u) - g(u)| + |f(v) - g(v)| + |f(v) - f(u)|). \]

Proof. By the definition of $\mu(\cdot)$, for any $x \in [f(u), f(v)]$,

\begin{align*}
\mu(x) \le \min\{|x - g(u)|, |x - g(v)|\} &\le \frac{|x - g(u)| + |x - g(v)|}{2} \\
&\le \frac{1}{2}\big( (|x - f(u)| + |f(u) - g(u)|) + (|x - f(v)| + |f(v) - g(v)|) \big) \\
&= \frac{1}{2}\big( |f(u) - g(u)| + |f(v) - g(v)| + |f(v) - f(u)| \big),
\end{align*}

where the third inequality follows by the triangle inequality, and the last equality uses $x \in [f(u), f(v)]$.
Therefore,

\begin{align*}
h(v) - h(u) = \int_{f(u)}^{f(v)} \mu(x)\,dx &\le |f(v) - f(u)| \cdot \max_{x \in [f(u), f(v)]} \mu(x) \\
&\le \frac{1}{2}|f(v) - f(u)| \cdot (|f(u) - g(u)| + |f(v) - g(v)| + |f(v) - f(u)|).
\end{align*}

□

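Now we are ready to prove Proposition 3.2.
Proof of Proposition 3.2. First, by Claim 3.4,

\begin{align*}
\sum_{u \sim v} w(u,v)\,|h(u) - h(v)| &\le \sum_{u \sim v} \frac{1}{2} w(u,v)\,|f(v) - f(u)| \cdot (|f(u) - g(u)| + |f(v) - g(v)| + |f(v) - f(u)|) \\
&\le \frac{1}{2}\mathcal{R}(f) + \frac{1}{2} \sqrt{\sum_{u \sim v} w(u,v)\,|f(v) - f(u)|^2} \cdot \sqrt{\sum_{u \sim v} w(u,v)\,(|f(u) - g(u)| + |f(v) - g(v)|)^2} \\
&\le \frac{1}{2}\mathcal{R}(f) + \frac{1}{2}\sqrt{\mathcal{R}(f)} \cdot \sqrt{2 \sum_{u \sim v} w(u,v)\,(|f(u) - g(u)|^2 + |f(v) - g(v)|^2)},
\end{align*}

where the second inequality follows by the Cauchy-Schwarz inequality. The sum under the last square root equals
$\|f - g\|_w^2$, so the right hand side is $\frac{1}{2}\mathcal{R}(f) + \frac{1}{2}\sqrt{2\mathcal{R}(f)}\,\|f - g\|_w$. On the other hand, by Claim 3.3,

\[ \sum_v w(v)\,h(v) \ge \frac{1}{8k} \sum_v w(v)\,f^2(v) = \frac{1}{8k}. \]

Putting the above equations together proves (3.4). Since the threshold sets of $h$ are the same as the threshold
sets of $f$, we have $\phi(f) = \phi(h)$ and the proposition follows by Lemma 2.4. □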

Now we are ready to prove Theorem 1.2. 

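Proof of Theorem 1.2. Let $g$ be as defined in Lemma 3.1. By Proposition 3.2, we get

\[ \phi(f) \le 4k\,\mathcal{R}(f) + 4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)} \le 4k\,\mathcal{R}(f) + \frac{8\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}} \le \frac{12\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}}. \]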

□ 
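For readers who want the arithmetic of the last chain spelled out, here is one way to see it, as a worked sketch. It uses the bound $\|f - g\|_w^2 \le 4\mathcal{R}(f)/\lambda_k$ established in the proof of Lemma 3.1 for the middle step and the fact that $\lambda_k \le 2$ for the last step.

```latex
\begin{align*}
4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)}
  &\le 4\sqrt{2}\,k \cdot 2\sqrt{\frac{\mathcal{R}(f)}{\lambda_k}} \cdot \sqrt{\mathcal{R}(f)}
   = \frac{8\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}}, \\
4k\,\mathcal{R}(f)
  &= \frac{4k\,\mathcal{R}(f)\sqrt{\lambda_k}}{\sqrt{\lambda_k}}
   \le \frac{4\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}}
   \qquad (\text{since } \lambda_k \le 2), \\
\frac{4\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}} + \frac{8\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}}
  &= \frac{12\sqrt{2}\,k\,\mathcal{R}(f)}{\sqrt{\lambda_k}}.
\end{align*}
```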

We provide a different proof of Theorem 1.2 in Appendix B by lower-bounding $\mathcal{E}_f$ using a $2k+1$ approximation
of $f$. This proof uses Lemma 2.6 instead of Lemma 2.4 to prove the theorem.

Remark: Claim 3.4 can be improved to

\[ |h(v) - h(u)| \le \frac{1}{2}|f(v) - f(u)| \cdot \Big(|f(u) - g(u)| + |f(v) - g(v)| + \frac{1}{2}|f(v) - f(u)|\Big), \]

and thus Theorem 1.2 can be improved to $\phi(f) \le 10\sqrt{2}\,k\,\mathcal{R}(f)/\sqrt{\lambda_k}$.

3.2 Second Proof



Instead of directly proving Theorem 1.2, we use Corollary 2.2 and Lemma 2.3 and prove a stronger version,
as it will be used later to prove Corollary 1.3.¹ In particular, instead of directly upper-bounding $\lambda_k$ we
construct $k$ disjointly supported functions with small Rayleigh quotients.

Theorem 3.5. For any non-negative function $f \in \ell^2(V,w)$ such that $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$, and $\delta :=
\phi^2(f)/\mathcal{R}(f)$, at least one of the following holds:

i) $\phi(f) \le O(k)\,\mathcal{R}(f)$;

ii) There exist $k$ disjointly supported functions $f_1, f_2, \ldots, f_k$ such that for all $1 \le i \le k$, $\mathrm{supp}(f_i) \subseteq \mathrm{supp}(f)$
and

\[ \mathcal{R}(f_i) \le O(k^2)\,\mathcal{R}(f)/\delta. \]

Furthermore, the support of each $f_i$ is an interval $[a_i, b_i]$ such that $|a_i - b_i| = \Theta(1/k)\,a_i$.

We will show that if $\mathcal{R}(f) = \Theta(\phi(G)^2)$ (when $\delta = \Theta(1)$), then $f$ is a smooth function of the vertices, in the
sense that in any interval of the form $[t, 2t]$ we expect the vertices to be embedded in equidistant positions.
It is instructive to verify this for the second eigenvector of the cycle.



Construction of Disjointly Supported Functions Using Dense Well Separated Regions 

First, we show that Theorem 3.5 follows from a construction of $2k$ dense well separated regions, and in
the subsequent parts we construct these regions based on $f$. A region $R$ is a closed subset of $\mathbb{R}_+$. Let
$\ell(R) := \sum_{v:\, f(v) \in R} w(v) f^2(v)$. We say $R$ is $W$-dense if $\ell(R) \ge W$. For any $x \in \mathbb{R}_+$, we define

\[ \mathrm{dist}(x, R) := \inf_{y \in R} \frac{|x - y|}{y}. \]

The $\epsilon$-neighborhood of a region $R$ is the set of points at distance at most $\epsilon$ from $R$,

\[ N_\epsilon(R) := \{x \in \mathbb{R}_+ : \mathrm{dist}(x, R) < \epsilon\}. \]

We say two regions $R_1, R_2$ are $\epsilon$-well-separated, if $N_\epsilon(R_1) \cap N_\epsilon(R_2) = \emptyset$. In the next lemma, we show that
our main theorem can be proved by finding $2k$, $\Omega(\delta/k)$-dense, $\Omega(1/k)$ well-separated regions.

Lemma 3.6. Let $R_1, R_2, \ldots, R_{2k}$ be a set of $W$-dense and $\epsilon$-well separated regions. Then, there are $k$
disjointly supported functions $f_1, \ldots, f_k$, each supported on the $\epsilon$-neighborhood of one of the regions such that

\[ \forall\, 1 \le i \le k, \quad \mathcal{R}(f_i) \le \frac{2\mathcal{R}(f)}{k \epsilon^2 W}. \]

Proof. For any $1 \le i \le 2k$, we define a function $f_i$, where for all $v \in V$,

\[ f_i(v) := f(v) \max\{0, 1 - \mathrm{dist}(f(v), R_i)/\epsilon\}. \]

Then, $\|f_i\|_w^2 \ge \ell(R_i)$. Since the regions are $\epsilon$-well separated, the functions are disjointly supported. Therefore,
the endpoints of each edge $\{u, v\} \in E$ are in the support of at most two functions. Thus, by an
averaging argument, there exist $k$ functions $f_1, f_2, \ldots, f_k$ (maybe after renaming) that satisfy the following. For
all $1 \le i \le k$,

\[ \sum_{u \sim v} w(u,v)\,|f_i(u) - f_i(v)|^2 \le \frac{1}{k} \sum_{j=1}^{2k} \sum_{u \sim v} w(u,v)\,|f_j(u) - f_j(v)|^2. \]

Therefore, for $1 \le i \le k$,

\[ \mathcal{R}(f_i) \le \frac{\sum_{j=1}^{2k} \sum_{u \sim v} w(u,v)\,|f_j(u) - f_j(v)|^2}{k \cdot \min_{1 \le i \le 2k} \|f_i\|_w^2} \le \frac{2 \sum_{u \sim v} w(u,v)\,|f(u) - f(v)|^2}{k \epsilon^2 W} = \frac{2\mathcal{R}(f)}{k \epsilon^2 W}, \]

where we used the fact that the $f_j$'s are $1/\epsilon$-Lipschitz. Therefore, $f_1, \ldots, f_k$ satisfy the lemma's statement. □

¹ We note that the first proof can also be modified to obtain this stronger version, without the additional property that each
$f_i$ is defined on an interval $[a_i, b_i]$ of the form $|a_i - b_i| = \Theta(1/k)\,a_i$. See Lemma 4.12 for an adaptation of Lemma 3.1 to prove
such a statement for maximum cut.
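The relative distance and the localized functions $f_i$ from the proof can be sketched as follows. The representation of a region as a list of closed intervals `(a, b)` with `a > 0`, and the names `dist` and `localize`, are assumptions made for this illustration only.

```python
def dist(x, region):
    """dist(x, R) = inf_{y in R} |x - y| / y, for R a union of closed intervals [a, b], a > 0."""
    best = float("inf")
    for a, b in region:
        if a <= x <= b:
            return 0.0
        y = a if x < a else b          # on [a, b] the nearest endpoint minimizes |x - y| / y
        best = min(best, abs(x - y) / y)
    return best


def localize(fv, region, eps):
    """f_i(v) = f(v) * max(0, 1 - dist(f(v), R_i) / eps), as a function of the value f(v)."""
    return fv * max(0.0, 1.0 - dist(fv, region) / eps)


region = [(1.0, 2.0)]                   # hypothetical region R_i
inside = localize(1.5, region, 0.25)    # value inside the region is kept unchanged
outside = localize(3.0, region, 0.25)   # value far from the region is zeroed out
```

Note that the distance is scaled by the point of the region, not by $x$, so it is not symmetric in its two arguments.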



Construction of Dense Well Separated Regions 



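Let $0 < \alpha < 1$ be a constant that will be fixed later in the proof. For $i \in \mathbb{Z}$, we define the interval
$I_i := [\alpha^i, \alpha^{i+1}]$. Observe that these intervals partition the vertices with positive value in $f$. We let $\ell_i := \ell(I_i)$.
We partition each interval $I_i$ into $12k$ subintervals of equal length,

\[ I_{i,j} := \left[ \alpha^i \Big(1 - \frac{j(1-\alpha)}{12k}\Big),\ \alpha^i \Big(1 - \frac{(j+1)(1-\alpha)}{12k}\Big) \right], \]

for $0 \le j < 12k$. Observe that for all $i, j$,

\[ \mathrm{len}(I_{i,j}) = \frac{\alpha^i (1-\alpha)}{12k}. \tag{3.5} \]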



Similarly we define $\ell_{i,j} := \ell(I_{i,j})$. We say a subinterval $I_{i,j}$ is heavy, if $\ell_{i,j} \ge c\,\delta\,\ell_{i-1}/k$, where $c > 0$ is a
constant that will be fixed later in the proof; we say it is light otherwise. We use $H_i$ to denote the set of
heavy subintervals of $I_i$ and $L_i$ for the set of light subintervals. We use $h_i$ to denote the number of heavy
subintervals. We also say an interval $I_i$ is balanced if $h_i \ge 6k$, denoted by $I_i \in B$ where $B$ is the set of
balanced intervals. Intuitively, an interval $I_i$ is balanced if the vertices are distributed uniformly inside that
interval.

Next we describe our proof strategy. Using Lemma 3.6 to prove the theorem it is sufficient to find $2k$, $\Omega(\delta/k)$-dense,
$\Omega(1/k)$ well-separated regions $R_1, \ldots, R_{2k}$. Each of our $2k$ regions will be a union of heavy subintervals.
Our construction is simple: from each balanced interval we choose $2k$ separated heavy subintervals and include
each of them in one of the regions. In order to promise that the regions are well separated, once we include
$I_{i,j} \in H_i$ into a region $R$ we leave the two neighboring subintervals $I_{i,j-1}$ and $I_{i,j+1}$ unassigned, so as to
separate $R$ from the rest of the regions. In particular, for all $1 \le a \le 2k$ and all $I_i \in B$, we include the
$(3a-1)$-th heavy subinterval of $I_i$ in $R_a$. Note that if an interval $I_i$ is balanced, then it has at least $6k$ heavy
subintervals and we can include one heavy subinterval in each of the $2k$ regions. Furthermore, by (3.5), the
regions are $(1-\alpha)/12k$-well separated. It remains to prove that these $2k$ regions are dense. Let

\[ \Delta := \sum_{I_i \in B} \ell_{i-1} \]

be the summation of the mass of the preceding interval of balanced intervals. Then, since each heavy
subinterval $I_{i,j}$ has a mass of at least $c\,\delta\,\ell_{i-1}/k$, by the above construction all regions are $c\Delta\delta/k$-dense. Hence, the
following proposition follows from Lemma 3.6.

Proposition 3.7. There are $k$ disjointly supported functions $f_1, \ldots, f_k$ such that for all $1 \le i \le k$, $\mathrm{supp}(f_i) \subseteq
\mathrm{supp}(f)$ and

\[ \forall\, 1 \le i \le k, \quad \mathcal{R}(f_i) \le \frac{300 k^2\, \mathcal{R}(f)}{(1-\alpha)^2\, c\, \delta\, \Delta}. \]
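The selection rule for one balanced interval can be sketched in a few lines. The function name `assign_heavy_subintervals` and the boolean-list input are assumptions for this illustration; the rule itself (region $a$ receives the $(3a-1)$-th heavy subinterval) is the one described above.

```python
def assign_heavy_subintervals(heavy_flags, k):
    """Pick one heavy subinterval of a single interval I_i for each of the 2k regions.

    heavy_flags[j] is True when subinterval I_{i,j} is heavy (j = 0, ..., 12k - 1).
    Returns {a: j} mapping region a (1..2k) to the index j of the (3a-1)-th heavy
    subinterval, or None when I_i is not balanced (fewer than 6k heavy subintervals).
    """
    heavy = [j for j, is_heavy in enumerate(heavy_flags) if is_heavy]
    if len(heavy) < 6 * k:
        return None
    # (3a - 1)-th heavy subinterval, counting from 1, sits at list position 3a - 2.
    return {a: heavy[3 * a - 2] for a in range(1, 2 * k + 1)}


k = 1
all_heavy = [True] * (12 * k)
choice = assign_heavy_subintervals(all_heavy, k)
```

Because two heavy subintervals are skipped between consecutive picks, any two chosen subintervals of the same interval have at least two subintervals between them.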



Lower Bounding the Density 

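So in the rest of the proof we just need to lower-bound $\Delta$ by an absolute constant.
Proposition 3.8. For any interval $I_i \notin B$,

\[ \mathcal{E}(I_i) \ge \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{24\,(k \alpha^4 \phi(f) + c\delta)}. \]

Proof. In the next claim, we lower-bound the energy of a light subinterval in terms of $\ell_{i-1}$. Then, we prove
the statement simply using $h_i < 6k$.

Claim 3.9. For any light subinterval $I_{i,j}$,

\[ \mathcal{E}(I_{i,j}) \ge \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{144k\,(k \alpha^4 \phi(f) + c\delta)}. \]

Proof. First, observe that

\[ \ell_{i-1} = \sum_{v \in I_{i-1}} w(v) f^2(v) \le \alpha^{2i-2}\, \mathrm{vol}(\alpha^i). \tag{3.6} \]

Therefore,

\[ \mathrm{vol}(I_{i,j}) = \sum_{v \in I_{i,j}} w(v) \le \sum_{v \in I_{i,j}} w(v) \frac{f^2(v)}{\alpha^{2i+2}} = \frac{\ell_{i,j}}{\alpha^{2i+2}} \le \frac{c\,\delta\,\ell_{i-1}}{k \alpha^{2i+2}} \le \frac{c\,\delta\,\mathrm{vol}(\alpha^i)}{k \alpha^4}, \tag{3.7} \]

where we use the assumption that $I_{i,j} \in L_i$ in the second last inequality, and (3.6) in the last inequality. By
Lemma 2.6,

\[ \mathcal{E}(I_{i,j}) \ge \frac{\phi(f)^2 \cdot \mathrm{vol}(\alpha^i)^2 \cdot \mathrm{len}(I_{i,j})^2}{\phi(f) \cdot \mathrm{vol}(\alpha^i) + \mathrm{vol}(I_{i,j})} \ge \frac{k \alpha^4 \phi(f)^2 \cdot \mathrm{vol}(\alpha^i) \cdot \mathrm{len}(I_{i,j})^2}{k \alpha^4 \phi(f) + c\delta} \ge \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{144k\,(k \alpha^4 \phi(f) + c\delta)}, \]

where the first inequality is Lemma 2.6, the second holds by (3.7), and the last holds by (3.5) and (3.6). □
Now, since the subintervals are disjoint, by Fact 2.5,

\[ \mathcal{E}(I_i) \ge \sum_{I_{i,j} \in L_i} \mathcal{E}(I_{i,j}) \ge (12k - h_i)\, \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{144k\,(k \phi(f) \alpha^4 + c\delta)} \ge \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{24\,(k \phi(f) \alpha^4 + c\delta)}, \]

where we used the assumption that $I_i$ is not balanced and thus $h_i < 6k$. □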

Now we are ready to lower-bound $\Delta$.

Proof of Theorem 3.5. First we show that $\Delta \ge 1/2$, unless (i) holds, and then we use Proposition 3.7 to
prove the theorem. If $\phi(f) \le 10^4 k\,\mathcal{R}(f)$, then (i) holds and we are done. So, assume that

\[ \phi(f) > 10^4 k\,\mathcal{R}(f), \tag{3.8} \]

and we prove (ii). Since $\|f\|_w = 1$, by Proposition 3.8,

\[ \mathcal{R}(f) \ge \sum_{I_i \notin B} \mathcal{E}(I_i) \ge \sum_{I_i \notin B} \frac{\alpha^6 \phi(f)^2\, \ell_{i-1} (1-\alpha)^2}{24\,(k \phi(f) \alpha^4 + c\delta)}. \]

Set $\alpha = 1/2$ and $c := \alpha^6 (1-\alpha)^2/96$. If $k \phi(f) \alpha^4 \ge c\delta$, then we get

\[ \sum_{I_i \notin B} \ell_{i-1} \le \frac{48 k\, \mathcal{R}(f)}{\alpha^2 (1-\alpha)^2 \phi(f)} \le \frac{1}{2}, \]

where the last inequality follows from (3.8). Otherwise,

\[ \sum_{I_i \notin B} \ell_{i-1} \le \frac{48 c\, \delta\, \mathcal{R}(f)}{\alpha^6 (1-\alpha)^2 \phi^2(f)} \le \frac{1}{2}, \]

where the last inequality follows from the definition of $c$ and $\delta$. Since $\ell(V) = \|f\|_w^2 = 1$, it follows from the
above equations that $\Delta \ge \frac{1}{2}$. Therefore, by Proposition 3.7, we get $k$ disjointly supported functions $f_1, \ldots, f_k$
such that

\[ \mathcal{R}(f_i) \le \frac{300 k^2\, \mathcal{R}(f)}{(1-\alpha)^2\, c\, \delta\, \Delta} \le \frac{10^8 k^2\, \mathcal{R}(f)^2}{\phi(f)^2}. \]

Although each function $f_i$ is defined on a region which is a union of many heavy subintervals, we can simply
restrict it to only one of those subintervals guaranteeing that $\mathcal{R}(f_i)$ only decreases. Therefore each $f_i$ is
defined on an interval $[a_i, b_i]$ where by (3.5), $|a_i - b_i| = \Theta(1/k)\,a_i$. This proves (ii). □
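The numerical constants in this proof can be checked with exact rational arithmetic. The sketch below plugs in $\alpha = 1/2$ and $c = \alpha^6(1-\alpha)^2/96$; the variable names are chosen for this illustration, and the two cases use $k\mathcal{R}(f)/\phi(f) \le 10^{-4}$ and $\delta\mathcal{R}(f)/\phi^2(f) = 1$ respectively.

```python
from fractions import Fraction

alpha = Fraction(1, 2)
c = alpha**6 * (1 - alpha)**2 / 96           # c = alpha^6 (1 - alpha)^2 / 96
delta_lower = Fraction(1, 2)                 # the lower bound 1/2 on Delta

# Case k*phi*alpha^4 >= c*delta: 48 k R / (alpha^2 (1 - alpha)^2 phi), with k R / phi <= 10^-4.
case1 = 48 / (alpha**2 * (1 - alpha)**2) * Fraction(1, 10**4)

# Otherwise: 48 c delta R / (alpha^6 (1 - alpha)^2 phi^2), with delta R / phi^2 = 1.
case2 = 48 * c / (alpha**6 * (1 - alpha)**2)

# Constant multiplying k^2 R(f)^2 / phi(f)^2 after applying Proposition 3.7.
final_const = 300 / ((1 - alpha)**2 * c * delta_lower)
```

Both cases stay at or below $1/2$, and the final constant is about $5.9 \cdot 10^7$, which is below the $10^8$ used in the statement.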



4 Extensions and Connections 

In this section, we extend our approach to other graph partitioning problems, including multiway partitioning 
(Subsection 4.1), balanced separator (Subsection 4.2), maximum cut (Subsection 4.3), and to the manifold 
setting (Subsection 4.4). Also, we discuss some relations between our setting and the settings for planted 
and semirandom instances (Subsection 4.5) and in stable instances (Subsection 4.6). 



4.1 Spectral Multiway Partitioning 

In this subsection, we use Theorem 3.5 and the results in [LOT12] to prove Corollary 1.3.

Theorem 4.1 ([LOT12, Theorem 1.3]). For any graph $G = (V,E,w)$ and any integer $k$, there exist $k$
non-negative disjointly supported functions $f_1, \ldots, f_k \in \ell^2(V,w)$ such that for each $1 \le i \le k$ we have
$\mathcal{R}(f_i) \le O(k^6)\lambda_k$.

Let $f_1, \ldots, f_k$ be as defined above. We consider two cases. First assume that $\mathrm{vol}(\mathrm{supp}(f_i)) \le \mathrm{vol}(V)/2$ for
all $1 \le i \le k$. Recall that $V_{f_i}(t_{opt})$ is the best threshold set of $f_i$. Let $S_i := V_{f_i}(t_{opt})$. Then, for each function
$f_i$, by Theorem 3.5,

\[ \phi(S_i) = \phi(f_i) \le O(l)\,\frac{\mathcal{R}(f_i)}{\sqrt{\lambda_l}} \le O(l k^6)\,\frac{\lambda_k}{\sqrt{\lambda_l}}. \]

Furthermore, since $S_i \subseteq \mathrm{supp}(f_i)$ and $f_1, \ldots, f_k$ are disjointly supported, $S_1, \ldots, S_k$ are disjoint. Hence,

\[ \phi_k(G) \le \max_{1 \le i \le k} \phi(S_i) \le O(l k^6)\,\frac{\lambda_k}{\sqrt{\lambda_l}}, \]

and we are done. Now suppose there exists a function, say $f_k$, with $\mathrm{vol}(\mathrm{supp}(f_k)) > \mathrm{vol}(V)/2$. Let $S_i =
V_{f_i}(t_{opt})$ for $1 \le i \le k-1$, and $S_k := V \setminus S_1 \setminus \ldots \setminus S_{k-1}$. Similar to the above, the sets $S_1, \ldots, S_{k-1}$ are
disjoint, and $\phi(S_i) \le O(l k^6 \lambda_k/\sqrt{\lambda_l})$ for all $1 \le i \le k-1$. Observe that

\[ \phi(S_k) = \frac{w(E(S_1, S_k)) + \ldots + w(E(S_{k-1}, S_k))}{\mathrm{vol}(V) - \mathrm{vol}(S_k)} \le \frac{\sum_{i=1}^{k-1} w(E(S_i, \overline{S_i}))}{\sum_{i=1}^{k-1} \mathrm{vol}(S_i)} \le O(l k^6)\,\frac{\lambda_k}{\sqrt{\lambda_l}}, \]

where the first equality uses $\mathrm{vol}(S_k) \ge \mathrm{vol}(V)/2$. Hence, $\phi_k(G) \le O(l k^6)\lambda_k/\sqrt{\lambda_l}$. This completes the proof
of (i) of Corollary 1.3.

To prove (ii) we use the following theorem of [LOT12].

Theorem 4.2 ([LOT12, Theorem 4.6]). For any graph $G = (V,E,w)$ and $\delta > 0$, the following holds: For
any $k \ge 2$, there exist $r \ge (1-\delta)k$ non-negative disjointly supported functions $f_1, \ldots, f_r \in \ell^2(V,w)$ such that
for all $1 \le i \le r$,

\[ \mathcal{R}(f_i) \le O(\delta^{-7} \log^2 k)\,\lambda_k. \]

It follows from (i) that without loss of generality we can assume that $\delta > 10/k$. Let $\delta' := \delta/2$. Then, by
the above theorem, there exist $r \ge (1-\delta')k$ non-negative disjointly supported functions $f_1, \ldots, f_r$ such that
$\mathcal{R}(f_i) \le O(\delta^{-7} \log^2 k)\lambda_k$ and $\mathrm{vol}(\mathrm{supp}(f_i)) \le \mathrm{vol}(V)/2$. For each $1 \le i \le r$, let $S_i := V_{f_i}(t_{opt})$. Similar to
the argument in part (i), since $S_i \subseteq \mathrm{supp}(f_i)$, the sets $S_1, \ldots, S_r$ are disjoint. Without loss of generality
assume that $\phi(S_1) \le \phi(S_2) \le \ldots \le \phi(S_r)$. Since $S_1, \ldots, S_{(1-\delta)k}$ are disjoint,

\[ \phi_{(1-\delta)k}(G) \le \phi(S_{(1-\delta)k+1}) \le \ldots \le \phi(S_r). \tag{4.1} \]

Let $m := l/(\delta' k) = 2l/(\delta k)$. If $\phi(f_i) \le O(m)\,\mathcal{R}(f_i)$ for some $(1-\delta)k < i \le r$, then we get

\[ \phi_{(1-\delta)k}(G) \le \phi(S_i) = \phi(f_i) \le O(m)\,\mathcal{R}(f_i) \le O\Big(\frac{l \log^2 k}{\delta^8 k}\Big)\lambda_k, \]

and we are done. Otherwise, by Theorem 3.5, for each $(1-\delta)k < i \le r$, there exist $m$ disjointly supported
functions $h_{i,1}, \ldots, h_{i,m}$ such that for all $1 \le j \le m$, $\mathrm{supp}(h_{i,j}) \subseteq \mathrm{supp}(f_i)$ and

\[ \mathcal{R}(h_{i,j}) \le O(m^2)\,\frac{\mathcal{R}(f_i)^2}{\phi(f_i)^2} \le O\Big(\frac{l^2 \log^4 k}{\delta^{16} k^2}\Big) \frac{\lambda_k^2}{\phi^2_{(1-\delta)k}(G)}, \tag{4.2} \]

where the second inequality follows from (4.1). Since $f_{(1-\delta)k+1}, \ldots, f_r$ are disjointly supported, all functions
$h_{i,j}$ are disjointly supported as well. Therefore, since $l = m(\delta' k) \le m(r - (1-\delta)k)$, by Lemma 2.3,

\[ \lambda_l \le 2 \max_{\substack{(1-\delta)k < i \le r \\ 1 \le j \le m}} \mathcal{R}(h_{i,j}) \le O\Big(\frac{l^2 \log^4 k}{\delta^{16} k^2}\Big) \frac{\lambda_k^2}{\phi^2_{(1-\delta)k}(G)}, \]

where the second inequality follows from (4.2). This completes the proof of (ii) of Corollary 1.3.

Part (iii) can be proved in a very similar way to part (ii). We just exploit the following theorem of [LOT12].

Theorem 4.3 ([LOT12, Theorem 3.7]). For any graph $G = (V,E,w)$ that excludes $K_h$ as a minor, and any
$\delta \in (0,1)$, there exist $(1-\delta)k$ non-negative disjointly supported functions $f_1, \ldots, f_{(1-\delta)k}$ such that

\[ \mathcal{R}(f_i) \le O(h^4 \delta^{-4})\,\lambda_k. \]

We follow the same proof steps as in part (ii) except that we upper bound $\mathcal{R}(f_i)$ by $O(h^4 \delta^{-4})\lambda_k$. This
completes the proof of Corollary 1.3.

In the remaining part of this section we describe some examples. First we show that there exists a graph
where $\phi_k(G) \ge \Omega(l - k + 1)\lambda_k/\sqrt{\lambda_l}$. Let $G$ be a union of $k-2$ isolated vertices and a cycle of length $n$.
Then, $\phi_k(G) = \Theta(1/n)$, $\lambda_k = \Theta(1/n^2)$ and for $l > k$, $\lambda_l = \Theta((l-k+1)^2/n^2)$. Therefore,

\[ \phi_k(G) \ge \Omega(l - k + 1)\,\frac{\lambda_k}{\sqrt{\lambda_l}}. \]

The above example shows that for $l \gg k$, the dependency on $l$ in the right hand side of part (i) of Corollary 1.3
is necessary.

Next we show that there exists a graph where $\phi_{k/2}(G) \ge \Omega(l/k)\lambda_k/\sqrt{\lambda_l}$. Let $G$ be a cycle of length $n$. Then,
$\phi_{k/2}(G) = \Theta(k/n)$, $\lambda_k = \Theta(k^2/n^2)$ and $\lambda_l = \Theta(l^2/n^2)$. Therefore,

\[ \phi_{k/2}(G) \ge \Omega(l/k)\,\frac{\lambda_k}{\sqrt{\lambda_l}}. \]

This shows that part (iii) of Corollary 1.3 is tight (up to constant factors) when $\delta$ is a constant.
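The eigenvalue scaling used in the cycle examples can be checked directly. The sketch below relies on the standard fact that the normalized Laplacian of the $n$-cycle has eigenvalues $1 - \cos(2\pi j/n)$ for $j = 0, \ldots, n-1$; the function name and the sample values of $n$ and $k$ are assumptions for illustration.

```python
import math


def cycle_eigenvalues(n):
    """Sorted eigenvalues 1 - cos(2*pi*j/n) of the normalized Laplacian of the n-cycle."""
    return sorted(1.0 - math.cos(2.0 * math.pi * j / n) for j in range(n))


n = 1000
lam = cycle_eigenvalues(n)                      # lam[k - 1] plays the role of lambda_k
# lambda_k * n^2 / k^2 should be roughly constant when k is much smaller than n.
ratios = [lam[k - 1] * n**2 / k**2 for k in (4, 8, 16, 32)]
```

For even $k$ much smaller than $n$ the ratio is close to $\pi^2/2$, which matches $\lambda_k = \Theta(k^2/n^2)$.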
4.2 Balanced Separator 

In this section we give a simple polynomial time algorithm with approximation factor $O(k/\lambda_k)$ for the
balanced separator problem. We restate Theorem 1.4 as follows.

Theorem 4.4. Let

\[ \epsilon := \min_{\mathrm{vol}(S) = \mathrm{vol}(V)/2} \phi(S). \]

There is a polynomial time algorithm that finds a set $S$ such that $\frac{1}{5}\mathrm{vol}(V) \le \mathrm{vol}(S) \le \frac{4}{5}\mathrm{vol}(V)$, and $\phi(S) \le
O(k\epsilon/\lambda_k)$.

We will prove the above theorem by repeated applications of Theorem 1.2. Our algorithm is similar to 
the standard algorithm for finding a balanced separator by applying Cheeger's inequality repeatedly. We 
inductively remove a subset of vertices of the remaining graph such that the union of the removed vertices 
is a non-expanding set in G, until the set of removed vertices has at least a quarter of the total volume. The 
main difference is that besides removing a sparse cut by applying Theorem 1.2, there is an additional step 
that removes a subset of vertices such that the conductance of the union of the removed vertices does not 
increase. The details are described in Algorithm 1. 

Let $U$ be the set of vertices remaining after a number of steps of the induction, where initially $U = V$. We
will maintain the invariant that $\phi_G(\overline{U}) \le O(k\epsilon/\lambda_k)$. Suppose $\mathrm{vol}(U) > \frac{4}{5}\mathrm{vol}(V)$. Let $H = (U, E(U))$ be the
induced subgraph of $G$ on $U$, and $0 = \lambda'_1 \le \lambda'_2 \le \ldots$ be the eigenvalues of $\mathcal{L}_H$. First, observe that $\lambda'_2 = O(\epsilon)$
as the following lemma shows.

Lemma 4.5. For any set $U \subseteq V$ with $\mathrm{vol}(U) \ge \frac{4}{5}\mathrm{vol}(V)$, let $H = (U, E(U))$ be the induced subgraph of $G$ on
$U$. Then the second smallest eigenvalue $\lambda'_2$ of $\mathcal{L}_H$ is at most $10\epsilon$.

Algorithm 1 A Spectral Algorithm for Balanced Separator

  $U \leftarrow V$.
  while $\mathrm{vol}(U) > \frac{4}{5}\mathrm{vol}(V)$ do
    Let $H = (U, E(U))$ be the induced subgraph of $G$ on $U$, and $\lambda'_2$ be the second smallest eigenvalue of $\mathcal{L}_H$.
    Let $f \in \ell^2(U,w)$ be a non-negative function such that $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(H)/2$, and $\mathcal{R}_H(f) \le \lambda'_2$.
    if $\phi_H(f) \le O(k)\,\mathcal{R}_H(f)/\sqrt{\lambda_k}$ then
      $U \leftarrow U \setminus U_f(t_{opt})$.
    else
      Let $f_1, \ldots, f_k$ be $k$ disjointly supported functions such that $\mathrm{supp}(f_i) \subseteq \mathrm{supp}(f)$ and
        \[ \phi_H(f) \le O(k)\,\frac{\mathcal{R}_H(f)}{\sqrt{\max_{1 \le i \le k} \mathcal{R}_H(f_i)}}, \]
      as defined in Theorem 3.5.
      Find a threshold set $S = U_{f_i}(t)$ for $1 \le i \le k$, and $t > 0$ such that
        \[ w(E(S, U \setminus S)) \le w(E(S, V \setminus U)). \]
      $U \leftarrow U \setminus S$.
    end if
  end while
  return $U$.
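The threshold-set step of Algorithm 1 amounts to a sweep over the values of $f$. The following sketch finds the best threshold set $\{v : f(v) \ge t\}$, $t > 0$, of a non-negative function on a weighted graph. The adjacency-dictionary representation, the function name `sweep_cut`, and the use of $\min\{\mathrm{vol}(S), \mathrm{vol}(V) - \mathrm{vol}(S)\}$ in the denominator are assumptions for this illustration; the graph is assumed to have no self-loops. This covers only the sweep, not the eigenvector computation or the rest of the algorithm.

```python
def sweep_cut(adj, f):
    """Return (conductance, set) of the best threshold set {v : f(v) >= t}, t > 0.

    adj: dict vertex -> dict neighbor -> weight (symmetric, no self-loops).
    f:   dict vertex -> non-negative value.
    """
    vol_total = sum(sum(nbrs.values()) for nbrs in adj.values())
    order = sorted((v for v in adj if f[v] > 0), key=lambda v: -f[v])
    in_set, cut, vol = set(), 0.0, 0.0
    best_phi, best_set = float("inf"), set()
    for idx, v in enumerate(order):
        in_set.add(v)
        vol += sum(adj[v].values())
        for u, wt in adj[v].items():
            cut += -wt if u in in_set else wt    # edge becomes internal or newly cut
        if idx + 1 < len(order) and f[order[idx + 1]] == f[v]:
            continue                             # only evaluate at distinct thresholds
        denom = min(vol, vol_total - vol)
        if denom > 0 and cut / denom < best_phi:
            best_phi, best_set = cut / denom, set(in_set)
    return best_phi, best_set


# Toy example (assumption): unit-weight path 0 - 1 - 2 - 3.
adj = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
f = {0: 1.0, 1: 0.8, 2: 0.1, 3: 0.05}
phi, S = sweep_cut(adj, f)
```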



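Proof. Let $(T, \overline{T})$ be the optimum bisection, and let $T' := U \cap T$. Since $\mathrm{vol}(U) \ge \frac{4}{5}\mathrm{vol}(V)$, and $\mathrm{vol}(T) =
\mathrm{vol}(V)/2$, we have

\[ \mathrm{vol}_H(T') \ge \mathrm{vol}_G(T) - 2\mathrm{vol}_G(\overline{U}) \ge \mathrm{vol}(V)/2 - 2\mathrm{vol}(V)/5 = \mathrm{vol}(V)/10 = \mathrm{vol}(T)/5. \]

Furthermore, since $E(T', U \setminus T') \subseteq E(T, \overline{T})$, we have

\[ \phi_H(T') = \frac{w(E(T', U \setminus T'))}{\mathrm{vol}_H(T')} \le \frac{w(E(T, \overline{T}))}{\mathrm{vol}_G(T)/5} = 5\phi(T) = 5\epsilon. \]

Therefore, by the easy direction of Cheeger's inequality (1.1), we have $\lambda'_2 \le 10\epsilon$. □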

To prove Theorem 4.4, it is sufficient to find a set $S \subseteq U$ with $\mathrm{vol}_H(S) \le \frac{1}{2}\mathrm{vol}_H(U)$ and conductance
$\phi_H(S) \le O(k\lambda'_2/\lambda_k) = O(k\epsilon/\lambda_k)$, because

\[ \phi_G(\overline{U} \cup S) \le \frac{w(E_G(U, \overline{U})) + w(E_H(S, \overline{S}))}{\mathrm{vol}_G(\overline{U}) + \mathrm{vol}_H(S)} \le \max(\phi_G(\overline{U}), \phi_H(S)) \le O(k\epsilon/\lambda_k), \]

and so we can recurse until $\frac{1}{5}\mathrm{vol}(V) \le \mathrm{vol}(\overline{U} \cup S) \le \frac{4}{5}\mathrm{vol}(V)$. Let $f \in \ell^2(U,w)$ be a non-negative function such
that $\mathrm{vol}_H(\mathrm{supp}(f)) \le \frac{1}{2}\mathrm{vol}_H(U)$ and $\mathcal{R}_H(f) \le \lambda'_2$, as defined in Proposition 2.1. If $\phi_H(f) \le O(k\lambda'_2/\lambda_k)$, then
we are done. Otherwise, we will find a set $S$ such that $\mathrm{vol}_H(S) \le \frac{1}{2}\mathrm{vol}_H(U)$ and $w(E(S, U \setminus S)) \le w(E(S, \overline{U}))$.
This implies that we can simply remove $S$ from $U$ without increasing the expansion of the union of the
removed vertices, because $\phi_G(S \cup \overline{U}) \le \phi_G(\overline{U})$ as the numerator (total weight of the cut edges) does not
increase while the denominator (volume of the set) can only increase.

It remains to find a set $S$ with either of the above properties. We can assume that $\phi_H(f) \ge O(k)\,\mathcal{R}_H(f)$ as
otherwise we are done. Then, by Theorem 3.5, there are $k$ disjointly supported functions $f_1, \ldots, f_k \in \ell^2(U,w)$
such that $\mathrm{supp}(f_i) \subseteq \mathrm{supp}(f)$ and

\[ \phi_H(f) \le O(k)\,\frac{\lambda'_2}{\sqrt{\max_{1 \le i \le k} \mathcal{R}_H(f_i)}}. \]

We extend $f_i \in \ell^2(U,w)$ to $f_i \in \ell^2(V,w)$ by defining $f_i(v) = 0$ for $v \in V - U$. We will prove that either
$\phi_H(f) \le O(k\lambda'_2/\lambda_k)$, or there is a threshold set $S = V_{f_i}(t)$ for some $1 \le i \le k$ and $t > 0$ such that
$w(E(S, U \setminus S)) \le w(E(S, \overline{U}))$. As $f_1, \ldots, f_k$ can be computed in polynomial time, this will complete the
proof of Theorem 4.4.

Suppose that for every $f_i$ and any threshold set $S = V_{f_i}(t)$ we have $w(E(S, \overline{U})) \le w(E(S, U \setminus S))$. Then, by
Lemma 4.6 that we will prove below, $\mathcal{R}_H(f_i) \ge \Omega(\mathcal{R}_G^2(f_i))$ for every $1 \le i \le k$. This implies that

\[ \phi_H(f) \le O(k)\,\frac{\lambda'_2}{\sqrt{\max_{1 \le i \le k} \mathcal{R}_H(f_i)}} \le O(k)\,\frac{\lambda'_2}{\max_{1 \le i \le k} \mathcal{R}_G(f_i)} \le O(k\lambda'_2/\lambda_k), \]

where the last inequality follows by Lemma 2.3 and the fact that $f_1, \ldots, f_k$ are disjointly supported.

Lemma 4.6. For any set $U \subseteq V$, let $H = (U, E(U))$ be the induced subgraph of $G$ on $U$, and $f \in \ell^2(V,w)$ be
a non-negative function such that $f(v) = 0$ for any $v \in V - U$. Suppose that for any threshold set $V_f(t)$, we
have

\[ w(E(V_f(t), \overline{U})) \le w(E(V_f(t), U \setminus V_f(t))), \]

then

\[ \sqrt{8\,\mathcal{R}_H(f)} \ge \mathcal{R}_G(f). \]

Proof. Since both sides of the inequality are homogeneous in $f$, we may assume that $\max_v f(v) \le 1$. Furthermore,
we can assume that $\sum_v w(v) f^2(v) = 1$ (this is achievable since we assumed that $w(v) \ge 1$ for all
$v \in V$). Observe that, since $w_H(v) \le w_G(v)$ for all $v \in U$,

\[ \sum_{v \in U} w_H(v) f^2(v) \le \sum_{v \in U} w_G(v) f^2(v) = \sum_v w_G(v) f^2(v) = 1. \tag{4.3} \]

Let $0 < t \le 1$ be chosen uniformly at random. Then, by linearity of expectation,

\begin{align*}
\mathbb{E}\big[ w(E(V_f(\sqrt{t}), U \setminus V_f(\sqrt{t}))) \big] &= \sum_{(u,v) \in E(U)} w(u,v)\,|f^2(u) - f^2(v)| \\
&= \sum_{(u,v) \in E(U)} w(u,v)\,|f(u) - f(v)|\,|f(u) + f(v)| \\
&\le \sqrt{\sum_{(u,v) \in E(U)} w(u,v)\,|f(u) - f(v)|^2} \cdot \sqrt{\sum_{(u,v) \in E(U)} w(u,v)\,(f(u) + f(v))^2} \\
&\le \sqrt{2\,\mathcal{R}_H(f)}, \tag{4.4}
\end{align*}

where the first equality uses the fact that $f(v) \le 1$ for all $v \in V$, and the last inequality follows by (4.3).
On the other hand, since $w(E(V_f(t), \overline{U})) \le w(E(V_f(t), U \setminus V_f(t)))$ for any $t$,

\begin{align*}
\mathbb{E}\big[ w(E(V_f(\sqrt{t}), U \setminus V_f(\sqrt{t}))) \big] &\ge \frac{1}{2}\,\mathbb{E}\big[ w(E(V_f(\sqrt{t}), \overline{V_f(\sqrt{t})})) \big] \\
&\ge \frac{1}{2} \sum_{u \sim v} w(u,v)\,|f(u) - f(v)|^2 = \frac{1}{2}\,\mathcal{R}_G(f), \tag{4.5}
\end{align*}

where the last inequality follows by the fact that $f(v) \ge 0$ for all $v \in V$, and the last equality follows by the
normalization $\sum_v w(v) f^2(v) = 1$. Putting together (4.4) and (4.5) proves the lemma. □

4.3 Maximum Cut



In this subsection we show that our techniques can be extended to the maximum cut problem providing a 
new spectral algorithm with its approximation ratio in terms of higher eigenvalues of the graph. 

Let $\mathcal{M}_G := I + D^{-1/2} A D^{-1/2}$. Observe that $\mathcal{M}$ is a positive semi-definite matrix, and an eigenvector with
eigenvalue $\alpha$ of $\mathcal{M}$ is an eigenvector with eigenvalue $2 - \alpha$ of $\mathcal{L}$. We use $0 \le \alpha_1 \le \ldots \le \alpha_n \le 2$ to denote its
eigenvalues. In this section we analyze a polynomial time approximation algorithm for the Maximum Cut
problem using the higher eigenvalues of $\mathcal{M}$. We restate Theorem 1.5 as follows.

Theorem 4.7. There is a polynomial time algorithm that on input graph $G$ finds a cut $(S, \overline{S})$ such that if
the optimal solution cuts at least $1 - \epsilon$ fraction of the edges, then $(S, \overline{S})$ cuts at least

\[ 1 - O(k) \log\Big(\frac{\alpha_k}{k\epsilon}\Big) \frac{\epsilon}{\alpha_k} \]

fraction of edges.

The structure of this algorithm is similar to the structure of the algorithm for the balanced separator problem,
with the following modifications. First, we use the bipartiteness ratio of an induced cut defined in [Tre09] in
place of the conductance of a cut. Then, similar to the first proof of Theorem 1.2, we show that the spectral
algorithm in [Tre09] returns an induced cut with bipartiteness ratio $O(k\alpha_1/\sqrt{\alpha_k})$. Finally, we iteratively
apply this improved analysis along with an additional step to obtain a cut with the performance guaranteed
in Theorem 4.7.

For an induced cut $(L, R)$ such that $L \cup R \ne \emptyset$, the bipartiteness ratio of $(L, R)$ is defined as

\[ \beta(L, R) := \frac{2w(E(L)) + 2w(E(R)) + w(E(L \cup R, \overline{L \cup R}))}{\mathrm{vol}(L \cup R)}. \]

The bipartiteness ratio $\beta(G)$ of $G$ is the minimum of $\beta(L, R)$ over all induced cuts $(L, R)$. For a function
$f \in \ell^2(V,w)$ and a threshold $t \ge 0$, let $L_f(t) := \{v : f(v) \le -t\}$ and $R_f(t) := \{v : f(v) \ge t\}$ be a threshold
cut of $f$. We let

\[ \beta(f) := \min_{t \ge 0} \beta(L_f(t), R_f(t)) \]

be the bipartiteness ratio of the best threshold cut of $f$, and let $(L_f(t_{opt}), R_f(t_{opt}))$ be the best threshold
cut of $f$. The following lemma is proved in [Tre09] and the proof is a simple extension of Lemma 2.4.

Lemma 4.8 ([Tre09]). For every non-zero function $h \in \ell^2(V,w)$,

\[ \beta(h) \le \frac{\sum_{u \sim v} w(u,v)\,|h(v) + h(u)|}{\sum_v w(v)\,|h(v)|}. \]
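The bipartiteness ratio of a given induced cut is straightforward to compute from its definition. The sketch below assumes a symmetric adjacency dictionary without self-loops and disjoint sets `L` and `R` with non-empty union; the function name and the toy graphs are chosen for this illustration.

```python
def bipartiteness_ratio(adj, L, R):
    """beta(L, R) = (2 w(E(L)) + 2 w(E(R)) + w(E(L u R, complement))) / vol(L u R).

    adj: dict vertex -> dict neighbor -> weight (symmetric, no self-loops).
    L, R: disjoint vertex sets whose union is non-empty.
    """
    L, R = set(L), set(R)
    both = L | R
    inside = boundary = vol = 0.0
    for v in both:
        for u, wt in adj[v].items():
            vol += wt
            if u not in both:
                boundary += wt               # edge leaving L u R, seen once
            elif (u in L) == (v in L):
                inside += wt                 # uncut internal edge, seen from both endpoints
    return (inside + boundary) / vol


# Toy examples (assumption): a 4-cycle with its bipartition, and a triangle.
cycle4 = {0: {1: 1.0, 3: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0, 0: 1.0}}
triangle = {0: {1: 1.0, 2: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {0: 1.0, 1: 1.0}}
beta_cycle = bipartiteness_ratio(cycle4, {0, 2}, {1, 3})
beta_triangle = bipartiteness_ratio(triangle, {0}, {1, 2})
```

The ratio is zero exactly when every edge touching $L \cup R$ goes between $L$ and $R$, as in the 4-cycle example.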

In this section we abuse the notation and write $\mathcal{R}(f)$, the Rayleigh quotient of $f$, as

\[ \mathcal{R}(f) := \frac{\sum_{u \sim v} w(u,v)\,|f(u) + f(v)|^2}{\sum_v w(v) f(v)^2}. \]

This is motivated by the fact that the eigenfunctions of $\mathcal{M}$ are the optimizers of the above ratio. In particular,
using the standard variational principles and Lemma 2.3,

\begin{align*}
\alpha_k &= \min_{f_1, \ldots, f_k \in \ell^2(V,w)}\ \max_{f \ne 0} \{\mathcal{R}(f) : f \in \mathrm{span}\{f_1, \ldots, f_k\}\} \\
&\le 2 \min_{\substack{f_1, \ldots, f_k \in \ell^2(V,w) \\ \text{disjointly supported}}}\ \max_{1 \le i \le k} \mathcal{R}(f_i),
\end{align*}

where the first minimum is over sets of $k$ non-zero orthogonal functions in the Hilbert space $\ell^2(V,w)$, and
the second minimum is over sets of $k$ disjointly supported functions in $\ell^2(V,w)$.

Trevisan [Tre09] proved the following characterization of the bipartiteness ratio in terms of $\alpha_1$.
Theorem 4.9 ([Tre09]). For any undirected graph $G$,

\[ \frac{\alpha_1}{2} \le \beta(G) \le \sqrt{2\alpha_1}. \]

We improve the right hand side of the above theorem and prove the following.
Theorem 4.10. For any function $f \in \ell^2(V,w)$ and any $1 \le k \le n$,

\[ \beta(f) \le 16\sqrt{2}\,k\,\frac{\mathcal{R}(f)}{\sqrt{\alpha_k}}. \]

Therefore, letting $\mathcal{R}(f) = \alpha_1$ implies $\beta(G) \le O(k\alpha_1/\sqrt{\alpha_k})$.

The proof of the above theorem is an adaptation of the proof of Theorem 1.2. Let $f$ be the eigenfunction
corresponding to $\alpha_1$ with $\|f\|_w = 1$. The main difference is that here we cannot assume $f$ is non-negative.
In fact most of the edges of the graph will have endpoints of different signs.

The rest of this section is organized as follows. First we prove Theorem 4.10 in Subsection 4.3.1. Then we 
prove Theorem 4.7 in Subsection 4.3.2. 

4.3.1 Improved Bounds on Bipartiteness Ratio 

We say a function $g \in \ell^2(V,w)$ is a $2k+1$ step approximation of $f$, if there exist thresholds $0 = t_0 \le t_1 \le
\ldots \le t_{2k}$ such that for any $v \in V$,

\[ g(v) = \psi_{-t_{2k}, -t_{2k-1}, \ldots, -t_1, 0, t_1, \ldots, t_{2k}}(f(v)). \]

In words, $g(v)$ is the value in the set $\{-t_{2k}, -t_{2k-1}, \ldots, -t_1, 0, t_1, \ldots, t_{2k}\}$ that is closest to $f(v)$. Note that
here for every threshold $t$ we include a symmetric threshold $-t$ in the step function. The proof of the next
lemma is an adaptation of Proposition 3.2.

Lemma 4.11. For any non-zero function $f \in \ell^2(V,w)$ with $\|f\|_w = 1$, and any $2k+1$-step approximation
of $f$, called $g$,

\[ \beta(f) \le 4k\,\mathcal{R}(f) + 4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)}. \]

Proof. Similar to Proposition 3.2, we will construct a function $h \in \ell^2(V,w)$ such that

\[ \frac{\sum_{u \sim v} w(u,v)\,|h(u) + h(v)|}{\sum_{v \in V} w(v)\,|h(v)|} \le 4k\,\mathcal{R}(f) + 4\sqrt{2}\,k\,\|f - g\|_w \sqrt{\mathcal{R}(f)}, \]

then the lemma follows from Lemma 4.8. Let $g$ be a $2k+1$ step approximation of $f$ with thresholds
$0 = t_0 \le t_1 \le \ldots \le t_{2k}$. Let $\mu(x) := |x - \psi_{-t_{2k}, \ldots, -t_1, 0, t_1, \ldots, t_{2k}}(x)|$. We define $h$ as follows:

\[ h(v) := \int_0^{f(v)} \mu(x)\,dx. \]

Note that if $f(v) \le 0$ then $h(v) := -\int_{f(v)}^0 \mu(x)\,dx$. First, by Claim 3.3,

\[ |h(v)| \ge \frac{1}{8k} f^2(v). \tag{4.6} \]

It remains to prove that for every edge $(u, v)$,

\[ |h(v) + h(u)| \le \frac{1}{2}|f(v) + f(u)| \cdot (|f(v) + f(u)| + |g(v) - f(v)| + |g(u) - f(u)|). \tag{4.7} \]

If $f(u)$ and $f(v)$ have different signs, then using the fact that $\mu(x) = \mu(-x)$,

\[ |h(u) + h(v)| = \Big| \int_0^{f(u)} \mu(x)\,dx + \int_0^{f(v)} \mu(x)\,dx \Big| = \Big| \int_{-f(v)}^{f(u)} \mu(x)\,dx \Big| \le |f(u) + f(v)| \cdot \max_{x \in [f(u), -f(v)]} \mu(x), \]

and thus (4.7) follows from the proof of Claim 3.4 which shows that $\max_{x \in [f(u), -f(v)]} \mu(x) \le \frac{1}{2}(|f(u) +
f(v)| + |g(v) - f(v)| + |g(u) - f(u)|)$. On the other hand, if $f(u)$ and $f(v)$ have the same sign, say that they
are both positive, then since $|\mu(x)| \le |x|$ for all $x$, we get

\[ |h(v) + h(u)| \le \int_0^{f(v)} x\,dx + \int_0^{f(u)} x\,dx \le \frac{1}{2}|f(v) + f(u)|^2. \]

Putting together (4.6) and (4.7), the lemma follows from a similar proof as in Proposition 3.2. □

Theorem 4.10 follows simply from the following lemma, which is an adaptation of Lemma 3.1. 

Lemma 4.12. For any non-zero function $f \in \ell^2(V,w)$ with $\|f\|_w = 1$, at least one of the following holds:

i) $\beta(f) \le 8k\,\mathcal{R}(f)$.

ii) There exist $k$ disjointly supported functions $f_1, \ldots, f_k$ such that for all $1 \le i \le k$,

\[ \mathcal{R}(f_i) \le \frac{256 k^2\, \mathcal{R}^2(f)}{\beta^2(f)}. \]

Proof. Let $M := \max_v |f(v)|$. We find $2k+1$ thresholds $0 = t_0 \le t_1 \le \ldots \le t_{2k} = M$, and define $g$ to be a
$2k+1$ step approximation of $f$ with respect to these thresholds. Let

\[ C := \frac{\beta^2(f)}{256 k^3\, \mathcal{R}(f)}. \]

We choose the thresholds inductively. Given $t_0, t_1, \ldots, t_{i-1}$, we let $t_{i-1} \le t_i \le M$ be the smallest number
such that

\[ \sum_{v:\, -t_i \le f(v) \le -t_{i-1}} w(v)\,|f(v) - \psi_{-t_i, -t_{i-1}}(f(v))|^2 + \sum_{v:\, t_{i-1} \le f(v) \le t_i} w(v)\,|f(v) - \psi_{t_{i-1}, t_i}(f(v))|^2 = C. \tag{4.8} \]

Similar to the proof of Lemma 3.1, the left hand side varies continuously with $t_i$, and it is non-decreasing.
If we can satisfy (4.8) for some $t_{i-1} \le t_i \le M$, then we let $t_i$ be the smallest such number; otherwise we
set $t_i = M$.

If $t_{2k} = M$ then we say the procedure succeeds. We show that if the procedure succeeds then (i) holds,
and if it fails then (ii) holds. First, if the procedure succeeds, then we can define $g$ to be the $2k+1$ step
approximation of $f$ with respect to $t_0, \ldots, t_{2k}$, and by (4.8) we get

\[ \|f - g\|_w^2 \le 2kC = \frac{\beta^2(f)}{128 k^2\, \mathcal{R}(f)}. \]

By Lemma 4.11, this implies that

\[ \beta(f) \le 4k\,\mathcal{R}(f) + \frac{1}{2}\beta(f), \]

and thus part (i) holds.

If the procedure does not succeed, then we will construct k disjointly supported functions of Rayleigh 
quotients less than 1/kC and that would imply (ii). For each 1 < i < 2k, we let fi be the following function, 



/*(«) := { 



-\f(v) - iP- ti ,- ti _, (/(«)) | if - U < /(«) < -U-i 
\f(v)- </> ti _ 1>t( (/(«))| if^i</(«)<ti 
otherwise. 



We will argue that at least k of these functions satisfy IZ(fi) < 1/kC. By (4.8), we already know that the 
denominators of TZ(fi) are equal to C, so it remains to find an upper bound for the numerators. For each 
pair of vertices u,v £ V, we will show that 



2/v 



^\Mu) + Mv)\ 3 <\nu) + f(v)\ 2 . (4.9) 
Note that $u, v$ are contained in the support of at most two of the functions. We distinguish three cases:

• $u$ and $v$ are in the support of the same function $f_i$. Then (4.9) holds since each $f_i$ is 1-Lipschitz.

• $u \in \mathrm{supp}(f_i)$ and $v \in \mathrm{supp}(f_j)$ for $i \ne j$, and $f(u), f(v)$ have the same sign. Then (4.9) holds since

$$|f_i(u) + f_i(v)|^2 + |f_j(u) + f_j(v)|^2 = |f_i(u)|^2 + |f_j(v)|^2 \le |f(u)|^2 + |f(v)|^2 \le |f(u) + f(v)|^2.$$

• $u \in \mathrm{supp}(f_i)$ and $v \in \mathrm{supp}(f_j)$ for $i \ne j$, and $f(u), f(v)$ have different signs. Then (4.9) holds by (3.3).
Summing inequality (4.9), we have

$$\sum_{i=1}^{2k} \mathcal{R}(f_i) = \frac{1}{C} \sum_{i=1}^{2k} \sum_{(u,v) \in E} w(u,v)\,|f_i(u) + f_i(v)|^2 \le \frac{1}{C} \sum_{(u,v) \in E} w(u,v)\,|f(u) + f(v)|^2 = 256k^3\,\frac{\mathcal{R}^2(f)}{\beta^2(f)}.$$

By an averaging argument, there are $k$ functions of Rayleigh quotients less than $256k^2\,\mathcal{R}^2(f)/\beta^2(f)$, and thus (ii) holds. □



4.3.2 Improved Spectral Algorithm for Maximum Cut 

In this section we prove Theorem 4.7. Our algorithm for max-cut is very similar to Algorithm 1. We inductively remove an induced cut such that the union of removed vertices cuts a large fraction of the edges. The detailed algorithm is described in Algorithm 2.

Algorithm 2 A Spectral Algorithm for Maximum Cut

$U \leftarrow V$, $L \leftarrow \emptyset$, $R \leftarrow \emptyset$.
while $E(U) \ne \emptyset$ do
    Let $H = (U, E(U))$ be the induced subgraph of $G$ on $U$.
    Let $f$ be the first eigenvector of $2I - \mathcal{L}_H$.
    if $\beta_H(f) \le 192\sqrt{2}\,k\,\mathcal{R}_H(f)/\alpha_k$ then
        $(L, R) \leftarrow (L \cup L_f(t_{opt}), R \cup R_f(t_{opt}))$.
    else
        Let $f_1, \ldots, f_k$ be $k$ disjointly supported functions such that
            $\beta_H(f) \le 16k\,\mathcal{R}_H(f) / \sqrt{\max_{1 \le i \le k} \mathcal{R}_H(f_i)}$,
        as defined in Lemma 4.12.
        Find a threshold cut $(L', R') = (L_{f_i}(t), R_{f_i}(t))$ for $1 \le i \le k$ such that
            $\min(\gamma(L \cup L', R \cup R'), \gamma(L \cup R', R \cup L')) \le \gamma(L, R)$.
        Remove $L', R'$ from $U$, and let $(L, R)$ be one of $(L \cup L', R \cup R')$ or $(L \cup R', R \cup L')$ with minimum uncutness.
    end if
end while
return $(L, R)$.



For technical reasons we define a parameter called uncutness to measure the total weight of cut edges throughout the algorithm. For an induced cut $(L, R)$, the uncutness of $(L, R)$ is defined as

$$\gamma(L, R) := w(E(L)) + w(E(R)) + w(E(L \cup R, \overline{L \cup R})).$$

In words, it is the total weight of the edges adjacent to $L$ and $R$ that are not in $E(L, R)$. Note that the coefficient of edges inside $L$ and $R$ is one (instead of two as in the definition of bipartiteness ratio).
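As a concrete illustration of this definition, the following Python sketch computes the uncutness of an induced cut. The edge-dictionary representation and the function name `uncutness` are assumptions made for this example only; they are not notation from the paper.

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def uncutness(weights: Dict[Edge, float], left: Set[Hashable], right: Set[Hashable]) -> float:
    """Total weight of edges touching L or R that do not cross between L and R.

    `weights` maps each undirected edge (u, v), listed once, to its weight.
    """
    total = 0.0
    for (u, v), w in weights.items():
        u_side = "L" if u in left else "R" if u in right else None
        v_side = "L" if v in left else "R" if v in right else None
        if u_side is None and v_side is None:
            continue  # edge lies entirely in the remaining graph
        if u_side is not None and v_side is not None and u_side != v_side:
            continue  # edge is cut by (L, R)
        total += w  # inside L, inside R, or leaving L u R: coefficient one
    return total


# Hypothetical example: the unit-weight path a-b-c-d.
path = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
gamma_cut = uncutness(path, {"a"}, {"b"})
gamma_same = uncutness(path, {"a", "b"}, set())
```

With $L = \{a\}$, $R = \{b\}$ only the edge $\{b, c\}$ is counted; with $L = \{a, b\}$ and $R = \emptyset$ both $\{a, b\}$ and $\{b, c\}$ are counted.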

Throughout the algorithm we maintain an induced cut $(L, R)$. To extend this induced cut, we either find an induced cut $(L', R')$ in the remaining graph with bipartiteness ratio $O(k\,\mathcal{R}_H(f)/\alpha_k)$, or an induced cut $(L', R')$ such that $\gamma(L \cup L', R \cup R') \le \gamma(L, R)$. We will show later that this would imply Theorem 4.7.

Let $(L, R)$ be the cut extracted after a number of steps of the induction, and let $U = V \setminus (L \cup R)$ be the set of the remaining vertices. Let $H = (U, E(U))$ be the induced subgraph of $G$ on $U$ and $0 \le \alpha'_1 \le \alpha'_2 \le \ldots$ be the eigenvalues of $2I - \mathcal{L}_H$. Furthermore, assume that $w(E(U)) = \rho \cdot w(E(V))$ where $0 < \rho \le 1$. Since the optimal solution cuts at least $1 - \epsilon$ (weighted) fraction of edges of $G$, it must cut at least $1 - \epsilon/\rho$ (weighted) fraction of the edges of $H$. Therefore, by Theorem 4.9,

$$\alpha'_1 \le 2\epsilon/\rho.$$

First, if $\beta_H(f) \le 192\sqrt{2}\,k\,\mathcal{R}_H(f)/\alpha_k$, then we find the best threshold cut $(L', R') = (L_f(t_{opt}), R_f(t_{opt}))$ of $f$, update $(L, R)$ to $(L \cup L', R \cup R')$, remove $L' \cup R'$ from $H$, and recurse. Otherwise, by Lemma 4.12, there are $k$ disjointly supported functions $f_1, \ldots, f_k$ such that for all $1 \le i \le k$,

$$\beta_H(f) \le 16k\,\frac{\mathcal{R}_H(f)}{\sqrt{\mathcal{R}_H(f_i)}}.$$

Next, we show that we can find a threshold cut $(L', R')$ of one of these functions such that

$$\min(\gamma(L \cup L', R \cup R'), \gamma(L \cup R', R \cup L')) \le \gamma(L, R).$$

In words, we can merge $(L', R')$ with the set of removed vertices such that the uncutness of the extended induced cut, say $(L \cup L', R \cup R')$, does not increase. To prove this claim we use Lemma 4.13, which will be proved below. By Lemma 4.13, if we cannot find such a threshold cut for each of the functions $f_1, \ldots, f_k$, then we must have

$$\sqrt{72\,\mathcal{R}_H(f_i)} \ge \mathcal{R}_G(f_i)$$

for all $1 \le i \le k$. Henceforth,

$$\beta_H(f) \le 16k\,\frac{\mathcal{R}_H(f)}{\sqrt{\max_{1 \le i \le k} \mathcal{R}_H(f_i)}} \le 96\sqrt{2}\,k\,\frac{\mathcal{R}_H(f)}{\max_{1 \le i \le k} \mathcal{R}_G(f_i)} \le 192\sqrt{2}\,k\,\frac{\mathcal{R}_H(f)}{\alpha_k},$$

where the last inequality follows by Lemma 2.3, and the assumption that $f_1, \ldots, f_k$ are disjointly supported. This is a contradiction. Therefore, we can always either find a threshold cut $(L', R')$ of $f$ such that

$$\beta(L', R') \le 192\sqrt{2}\,k\,\frac{\mathcal{R}_H(f)}{\alpha_k} \le 600k\,\frac{\epsilon}{\rho\,\alpha_k},$$

or we can remove an induced cut from $H$ while making sure that the uncutness of the induced cut does not increase. We keep doing this until $E(U) = \emptyset$.
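The threshold-cut step can be sketched in Python as follows. The sketch assumes the conventions $L_f(t) = \{v : f(v) \le -t\}$ and $R_f(t) = \{v : f(v) \ge t\}$, and the bipartiteness ratio $\beta(L, R) = (2w(E(L)) + 2w(E(R)) + w(E(L \cup R, \overline{L \cup R})))/\mathrm{vol}(L \cup R)$; both are taken to match the definitions used earlier in this section, and all function names are hypothetical.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def threshold_cut(f: Dict[Vertex, float], t: float) -> Tuple[Set[Vertex], Set[Vertex]]:
    """Return (L_f(t), R_f(t)) for a threshold t > 0."""
    left = {v for v, x in f.items() if x <= -t}
    right = {v for v, x in f.items() if x >= t}
    return left, right


def bipartiteness_ratio(weights: Dict[Edge, float], left: Set[Vertex], right: Set[Vertex]) -> float:
    """(2 w(E(L)) + 2 w(E(R)) + w(E(L u R, rest))) / vol(L u R)."""
    numerator = 0.0
    volume = 0.0
    for (u, v), w in weights.items():
        u_in = u in left or u in right
        v_in = v in left or v in right
        volume += w * (u_in + v_in)  # each endpoint in L u R adds w to the volume
        if u_in and v_in:
            same_side = (u in left) == (v in left)
            numerator += 2.0 * w if same_side else 0.0
        elif u_in or v_in:
            numerator += w
    return numerator / volume if volume > 0 else float("inf")


def best_threshold_cut(
    weights: Dict[Edge, float], f: Dict[Vertex, float]
) -> Optional[Tuple[float, Set[Vertex], Set[Vertex]]]:
    """Try every threshold t = |f(v)| > 0 and keep the cut of smallest ratio."""
    best = None
    for t in sorted({abs(x) for x in f.values() if x != 0}):
        left, right = threshold_cut(f, t)
        ratio = bipartiteness_ratio(weights, left, right)
        if best is None or ratio < best[0]:
            best = (ratio, left, right)
    return best


# Hypothetical example: a unit-weight triangle, which is not bipartite.
triangle = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0}
ratio, left, right = best_threshold_cut(triangle, {"a": 1.0, "b": -1.0, "c": 1.0})
```

Only thresholds equal to some $|f(v)|$ need to be tried, because the pair $(L_f(t), R_f(t))$ changes only at those values.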

It remains to calculate the ratio of the edges cut by the final solution of the algorithm. Let $\rho_j \cdot w(E)$ be the total weight of the edges in $H$ before the $j$-th iteration of the while loop for all $j \ge 1$; in particular $\rho_1 = 1$.

Suppose the first case holds, i.e. we choose a threshold cut of $f$ with small bipartiteness ratio. Then we cut at least a $(1 - 600k\epsilon/(\rho_j\alpha_k))$ fraction of the edges removed from $H$ in the $j$-th iteration. Since the weight of the edges in the $(j+1)$-st iteration is $\rho_{j+1} w(E)$, we can lower-bound the weight of the cut edges by

$$(\rho_j w(E) - \rho_{j+1} w(E))\left(1 - \frac{600k\epsilon}{\rho_j\alpha_k}\right) \ge w(E) \int_{\rho_{j+1}}^{\rho_j} \left(1 - \frac{600k\epsilon}{r\,\alpha_k}\right) dr.$$

Suppose the second case holds, i.e. we choose a threshold cut of one of $f_1, \ldots, f_k$. Then, since the uncutness does not increase, the weight of the newly cut edges in the $j$-th iteration is at least as large as the total weight of the edges removed from $H$ in the $j$-th iteration. In other words, the total weight of the edges cut in the $j$-th iteration is at least $\rho_j w(E) - \rho_{j+1} w(E)$ in this case.

Putting these together, the fraction of edges cut by Algorithm 2 is at least

$$\int_{r=0}^{1} \max\left\{1 - \frac{600k\epsilon}{r\,\alpha_k},\, 0\right\} dr = 1 - \frac{600k\epsilon}{\alpha_k}\left(1 + \ln\left(\frac{\alpha_k}{600k\epsilon}\right)\right).$$

This completes the proof of Theorem 4.7.
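The closed form of the final bound can be checked numerically. The sketch below writes $a = 600k\epsilon/\alpha_k$ (a shorthand introduced here and assumed to lie in $(0, 1]$) and compares a midpoint-rule approximation of $\int_0^1 \max\{1 - a/r, 0\}\,dr$ with $1 - a(1 + \ln(1/a))$. Truncating the integrand at zero is an assumption of this sketch, made so that the integral is finite.

```python
import math


def closed_form(a: float) -> float:
    """1 - a (1 + ln(1/a)) for 0 < a <= 1."""
    return 1.0 - a * (1.0 + math.log(1.0 / a))


def midpoint_integral(a: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of max(1 - a/r, 0) over r in (0, 1]."""
    h = 1.0 / steps
    return sum(max(1.0 - a / ((i + 0.5) * h), 0.0) for i in range(steps)) * h


# Compare both expressions for two sample values of a.
values = {a: (closed_form(a), midpoint_integral(a)) for a in (0.1, 0.5)}
```

The integrand vanishes for $r \le a$, so the integral equals $\int_a^1 (1 - a/r)\,dr = (1 - a) - a\ln(1/a)$, which is the stated closed form.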

Lemma 4.13. For any set $U \subseteq V$, let $H = (U, E(U))$ be the induced subgraph of $G$ on $U$, and $f \in \ell^2(V, w)$ be a non-zero function such that $f(v) = 0$ for any $v \notin U$. Also let $(L, R)$ be a partitioning of $\overline{U}$. If for any threshold cut $(L_f(t), R_f(t))$,

$$\min(\gamma_G(L \cup L_f(t), R \cup R_f(t)),\ \gamma_G(L \cup R_f(t), R \cup L_f(t))) > \gamma_G(L, R), \tag{4.10}$$

then

$$\sqrt{72\,\mathcal{R}_H(f)} \ge \mathcal{R}_G(f).$$

Proof. First, observe that if

$$\tfrac{1}{2} w(E(L_f(t) \cup R_f(t), \overline{U})) > w(E(L_f(t))) + w(E(R_f(t))) + w(E(L_f(t) \cup R_f(t),\ U \setminus (L_f(t) \cup R_f(t)))),$$

then (4.10) does not hold for that $t$. Therefore, if (4.10) holds for any threshold cut $(L_f(t), R_f(t))$ of $f$, then we have (the weaker condition) that

$$\tfrac{1}{2} w(E(L_f(t) \cup R_f(t), \overline{U})) \le 2w(E(L_f(t))) + 2w(E(R_f(t))) + w(E(L_f(t) \cup R_f(t),\ U \setminus (L_f(t) \cup R_f(t)))). \tag{4.11}$$

Henceforth, we prove the lemma by showing that $\sqrt{72\,\mathcal{R}_H(f)} \ge \mathcal{R}_G(f)$ holds whenever (4.11) holds for any threshold cut of $f$.

Since both sides of (4.11) are homogeneous in $f$, we may assume that $\max_v |f(v)| \le 1$. Furthermore, we can assume that $\sum_{v \in V} w(v) f^2(v) = 1$. Observe that, since $w_H(v) \le w_G(v)$ for all $v \in U$,

$$\sum_{v \in U} w_H(v) f^2(v) \le \sum_{v \in U} w_G(v) f^2(v) = \sum_{v \in V} w_G(v) f^2(v) = 1. \tag{4.12}$$

Let $0 < t \le 1$ be chosen uniformly at random. For any vertex $v$, let $Z_v$ be the random variable where

$$Z_v = \begin{cases} 1 & \text{if } f(v) \ge \sqrt{t}, \\ -1 & \text{if } f(v) \le -\sqrt{t}, \\ 0 & \text{otherwise.} \end{cases}$$

Claim 4.14. For any edge $\{u, v\} \in E$,

$$\tfrac{1}{2} |f(u) + f(v)|^2 \le E[|Z_u + Z_v|] \le |f(u) + f(v)|\,(|f(u)| + |f(v)|).$$

Proof. Without loss of generality assume that $|f(u)| \le |f(v)|$. We consider two cases.

• If $f(u)$ and $f(v)$ have different signs, then $|Z_u + Z_v| = 1$ when $|f(u)|^2 < t \le |f(v)|^2$, and $|Z_u + Z_v| = 0$ otherwise. Therefore,

$$E[|Z_u + Z_v|] = |f(v)|^2 - |f(u)|^2 = |f(u) + f(v)|\,(|f(u)| + |f(v)|),$$

and the claim holds.

• If $f(u)$ and $f(v)$ have the same sign, then

$$|Z_u + Z_v| = \begin{cases} 2 & \text{if } t \le |f(u)|^2, \\ 1 & \text{if } |f(u)|^2 < t \le |f(v)|^2, \\ 0 & \text{if } |f(v)|^2 < t. \end{cases}$$

Therefore,

$$\tfrac{1}{2} (f(u) + f(v))^2 \le E[|Z_u + Z_v|] = f(u)^2 + f(v)^2 \le (f(u) + f(v))^2. \qquad □$$
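Claim 4.14 is easy to sanity-check numerically. The following Python sketch estimates $E[|Z_u + Z_v|]$ with a midpoint rule over $t \in (0, 1]$ and compares it with the two bounds on a small grid of values for $f(u)$ and $f(v)$. The grid, the step count and the tolerance are arbitrary choices made for this illustration.

```python
def z_value(x: float, t: float) -> int:
    """Z_v for f(v) = x and threshold parameter t in (0, 1]."""
    root = t ** 0.5
    if x >= root:
        return 1
    if x <= -root:
        return -1
    return 0


def expected_abs_sum(fu: float, fv: float, steps: int = 4000) -> float:
    """Midpoint-rule estimate of E|Z_u + Z_v| for t uniform on (0, 1]."""
    h = 1.0 / steps
    return sum(
        abs(z_value(fu, (i + 0.5) * h) + z_value(fv, (i + 0.5) * h))
        for i in range(steps)
    ) * h


grid = [-1.0, -0.6, -0.2, 0.0, 0.3, 0.7, 1.0]
tolerance = 2e-3  # allowance for the discretisation of t
claim_holds = all(
    0.5 * (fu + fv) ** 2 - tolerance
    <= expected_abs_sum(fu, fv)
    <= abs(fu + fv) * (abs(fu) + abs(fv)) + tolerance
    for fu in grid
    for fv in grid
)
```

For values of opposite sign the upper bound is attained exactly, for example $f(u) = 0.3$, $f(v) = -0.6$ gives $0.36 - 0.09 = 0.27$ on both sides.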

The rest of the proof is very similar to that in Lemma 4.6. Writing $L = L(\sqrt{t})$ and $R = R(\sqrt{t})$ for brevity,

$$\begin{aligned}
E[2w(E(L)) + 2w(E(R)) + w(E(L \cup R,\ U \setminus (L \cup R)))]
&= \sum_{(u,v) \in E(U)} w(u,v)\, E[|Z_u + Z_v|] \\
&\le \sum_{(u,v) \in E(U)} w(u,v)\, |f(u) + f(v)|\,(|f(u)| + |f(v)|) \\
&\le \sqrt{\sum_{(u,v) \in E(U)} w(u,v)\, |f(u) + f(v)|^2}\ \sqrt{\sum_{(u,v) \in E(U)} w(u,v)\, (|f(u)| + |f(v)|)^2} \\
&\le \sqrt{2\,\mathcal{R}_H(f)},
\end{aligned} \tag{4.13}$$

where the first inequality follows by Claim 4.14, and the last inequality follows by (4.12). On the other hand, by (4.11),

$$\begin{aligned}
E[2w(E(L)) + 2w(E(R)) + w(E(L \cup R,\ U \setminus (L \cup R)))]
&\ge \tfrac{1}{3}\, E[2w(E(L)) + 2w(E(R)) + w(E(L \cup R,\ V \setminus (L \cup R)))] \\
&= \tfrac{1}{3} \sum_{(u,v) \in E} w(u,v)\, E[|Z_u + Z_v|] \\
&\ge \tfrac{1}{6} \sum_{(u,v) \in E} w(u,v)\, |f(u) + f(v)|^2 = \tfrac{1}{6}\, \mathcal{R}_G(f),
\end{aligned} \tag{4.14}$$

where the second inequality follows from Claim 4.14, and the last equality follows from the normalization $\sum_v w(v) f^2(v) = 1$. Putting together (4.13) and (4.14) proves the lemma. □



4.4 Manifold Setting 

The eigenvalues of a closed Riemannian manifold can be approximated by the eigenvalues of the Laplacian of the graph of an $\epsilon$-net in $M$ [Fuj95]. Hence, Theorem 1.2 implies a generalized Cheeger's inequality for closed Riemannian manifolds.

Theorem 4.15. Let $M$ be a $d$-dimensional closed Riemannian manifold. Let $\lambda_k(M)$ be the $k$-th eigenvalue of the Laplacian of $M$ and $\phi(M)$ be the Cheeger isoperimetric constant of $M$. Then

$$\phi(M) \le C\,k\,\frac{\lambda_2(M)}{\sqrt{\lambda_k(M)}},$$

where $C$ depends on $d$ only.



4.5 Planted and Semi-Random Instances

As discussed in the introduction, spectral techniques can be used to recover the hidden bisection when $p - q \ge \Omega(\sqrt{p \log|V| / |V|})$ in the planted random model [Bop87, McS01], and for other hidden partition problems [AKS98, McS01]. Some semi-random models have been proposed and the results in planted random models can be generalized using semidefinite programming relaxations [FK01, MMV12]: Feige and Kilian [FK01] considered the model where a planted instance is generated and an adversary is allowed to delete arbitrary edges between the parts and add arbitrary edges within the parts, and they proved that an SDP-based algorithm can recover the hidden partition when $p - q \ge \Omega(\sqrt{p \log|V| / |V|})$. Makarychev, Makarychev and Vijayaraghavan [MMV12] considered a more flexible model where the induced subgraph of each part is arbitrary, and proved that an SDP-based algorithm would find a balanced cut with good quality. These results show that SDP-based algorithms are more powerful than spectral techniques for semi-random instances.

For graph bisection, we note that there will be a gap between $\lambda_2$ and $\lambda_3$ in the instances in the planted random model when $p - q$ is large enough. Theorem 1.2 shows that the spectral partitioning algorithm performs better in instances just satisfying this "pseudorandom" property, although the bounds are much weaker when applied to random planted instances. For example, our result implies that the spectral partitioning algorithm performs better in the following "deterministic" planted instances where there are two arbitrary bounded degree expanders of size $|V|/2$ with an arbitrary bounded degree sparse cut between them.

Corollary 4.16. Let $G = (V, E)$ be an unweighted graph such that $V = A \cup B$, where $\mathrm{vol}(A) = \mathrm{vol}(B)$ and $\phi(A) = \phi(B) = \phi$. Let $G_A$ and $G_B$ be the induced subgraphs of $G$ on $A$ and $B$, and $\varphi = \min(\phi(G_A), \phi(G_B))$. Suppose that the minimum degree in $G_A$ and $G_B$ is at least $d_1$, and the maximum degree of the bipartite subgraph $G' = (A \cup B, E(A, B))$ is at most $d_2$. Then the spectral partitioning algorithm applied to $G$ returns a set of conductance

$$O\left(\frac{\phi}{\varphi} \cdot \frac{d_1 + d_2}{d_1}\right).$$



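Proof. We call $\{S_1, S_2, S_3\}$ a 3-partition of $V$ if $S_1, S_2, S_3$ are disjoint and $S_1 \cup S_2 \cup S_3 = V$. We will show that any 3-partition of $V$ contains a set of large conductance. This implies that $\phi_3(G)$ is large, and thus $\lambda_3$ is large by the higher-order Cheeger's inequality. Then Theorem 1.2 will prove the corollary.

Given a 3-partition, let $S$ be the set of smallest volume, then $\mathrm{vol}(S) \le \mathrm{vol}(V)/3$. We show that

$$\phi(S) \ge \frac{\varphi\, d_1}{2(d_1 + d_2)}. \tag{4.15}$$

Let $m = |E(S) \cap E(A, B)|$ be the number of induced edges in $S$ that cross $A$ and $B$. Then $|S| \ge 2m/d_2$, since the total degree of $S$ in $G'$ is at least $2m$ but the maximum degree in $G'$ is at most $d_2$. Observe that

$$\begin{aligned}
|E(S, \overline{S})| &\ge |E(S \cap A,\ A - S \cap A)| + |E(S \cap B,\ B - S \cap B)| \\
&\ge \tfrac{1}{2}\, \phi_{G_A}(S \cap A) \cdot \mathrm{vol}(S \cap A) + \tfrac{1}{2}\, \phi_{G_B}(S \cap B) \cdot \mathrm{vol}(S \cap B) \\
&\ge \tfrac{\varphi}{2}\, (\mathrm{vol}(S) - 2m),
\end{aligned}$$

where the second inequality follows by the fact that $\mathrm{vol}(S) \le \tfrac{2}{3}\mathrm{vol}(A) = \tfrac{2}{3}\mathrm{vol}(B)$, and the last inequality follows by $\phi(G_A) \ge \varphi$ and $\phi(G_B) \ge \varphi$. Therefore,

$$\phi(S) = \frac{|E(S, \overline{S})|}{\mathrm{vol}(S)} \ge \frac{\varphi \cdot (\mathrm{vol}(S) - 2m)}{2\,\mathrm{vol}(S)} = \frac{\varphi}{2} - \frac{m \cdot \varphi}{\mathrm{vol}(S)} \ge \frac{\varphi}{2} - \frac{m \cdot \varphi}{2m + d_1 |S|} \ge \frac{\varphi}{2} - \frac{\varphi}{2 + 2d_1/d_2} = \frac{\varphi}{2} \cdot \frac{d_1}{d_1 + d_2},$$

where the last inequality uses the fact that $|S| \ge 2m/d_2$. This proves (4.15). Therefore, $\phi_3(G) \ge \varphi d_1 / (2(d_1 + d_2))$. But by the higher order Cheeger's inequality, $\phi_3(G) = O(\sqrt{\lambda_3})$. Therefore, Theorem 1.2 implies that the spectral partitioning algorithm returns a set of conductance

$$O\left(\frac{\lambda_2}{\sqrt{\lambda_3}}\right) = O\left(\frac{\phi}{\varphi} \cdot \frac{d_1 + d_2}{d_1}\right).$$

□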



We note that the degree requirements on $d_1$ and $d_2$ are necessary. Otherwise, the bipartite graph $G'$ may only contain a heavy edge (with weight $\phi \cdot \mathrm{vol}(V)/2$) connecting $u \in A$ and $v \in B$ where $d_{G_A}(u) = d_{G_B}(v) = 1$. Then $A - \{u\}$, $B - \{v\}$, $\{u, v\}$ are all sparse cuts and $\lambda_3 \approx \lambda_2$, and Theorem 1.2 would not apply.

Corollary 4.16 implies that the spectral partitioning algorithm is a constant factor approximation algorithm for planted random instances. Let $G = (A \cup B, E)$ be a graph such that $|A| = |B| = |V|/2$, where each induced edge in $A$ and each induced edge in $B$ appears with probability $p$ and each edge crossing $A$ and $B$ appears with probability $q$. Suppose $p > q \ge \Omega(\ln n / n)$; then with high probability $\mathrm{vol}(A) \approx \mathrm{vol}(B)$, $\phi(A) \approx \phi(B) \approx q/(p+q)$ and $\phi(G_A) \approx \phi(G_B) \approx \Theta(1)$. Putting the parameters $\phi \approx q/(p+q)$, $\varphi \approx \Theta(1)$, $d_1 \approx pn$, $d_2 \approx qn$, Corollary 4.16 implies that the spectral partitioning algorithm returns a set of conductance $O(q/p)$.
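The substitution of the planted-model parameters can be written out as a short Python computation. It assumes that the bound of Corollary 4.16 has the form $O\big(\tfrac{\phi}{\varphi} \cdot \tfrac{d_1 + d_2}{d_1}\big)$, drops the hidden constant, and takes $\varphi = 1$ as a stand-in for $\Theta(1)$; the function name and the sample values of $n$, $p$, $q$ are hypothetical.

```python
def corollary_bound(phi: float, varphi: float, d1: float, d2: float) -> float:
    """(phi / varphi) * (d1 + d2) / d1, i.e. the Corollary 4.16 bound without its constant."""
    return (phi / varphi) * (d1 + d2) / d1


# Planted random instance with hypothetical parameters.
n, p, q = 10_000, 0.3, 0.05
bound = corollary_bound(phi=q / (p + q), varphi=1.0, d1=p * n, d2=q * n)
```

The factors $p + q$ cancel, so the expression reduces to $q/p$ for every choice of $n$.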
4.6 Stable Instances



Several clustering problems are shown to be easier on stable instances [BBG09, ABS10, DLS12], but there are no known results on the stable sparsest cut problem. As discussed earlier, the algebraic condition that $\lambda_2$ is small and $\lambda_3$ is large is of similar flavour to the condition that there is a stable sparse cut, but they do not imply each other. On one hand, using the definition of stability in the introduction, one can construct an instance with an $\Omega(n)$-stable sparse cut but the gap between $\lambda_2$ and $\lambda_3$ is $O(1/n^2)$: Suppose the vertices are $\{1, \ldots, 2n\}$. There is an odd cycle $1$-$3$-$5$-$7$-$9$-$\ldots$-$(2n-1)$-$1$ where each edge is of weight 1. There is an even cycle $2$-$4$-$6$-$8$-$10$-$\ldots$-$2n$-$2$ where each edge is of weight 1. There is an edge between $2i - 1$ and $2i$ for each $1 \le i \le n$, where each edge is of weight $c/n^2$ for a constant $c$. Then the optimal cut is the set of odd vertices with conductance of order $1/n^2$, and this is an $\Omega(n)$-stable sparse cut. But the second eigenvector and the third eigenvector will be the same as in the cycle example (if $c$ is a large enough constant), where the vertices are in the order $1, 2, 3, \ldots, 2n$ and the Rayleigh quotients are of order $1/n^2$.
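The two-cycle instance is small enough to build explicitly. The Python sketch below constructs its edge weights and evaluates the conductance of the odd-vertex cut, which works out to $c/(2n^2 + c)$. The helper names are hypothetical, and the construction assumes $n \ge 3$ so that both cycles are simple.

```python
from typing import Dict, Set, Tuple


def two_cycle_instance(n: int, c: float) -> Dict[Tuple[int, int], float]:
    """Edge weights of the odd cycle, the even cycle and the matching on vertices 1..2n."""
    weights = {}
    for i in range(1, n + 1):
        odd, even = 2 * i - 1, 2 * i
        next_odd = 2 * (i % n) + 1   # wraps 2n-1 back to 1
        next_even = 2 * (i % n) + 2  # wraps 2n back to 2
        weights[(odd, next_odd)] = 1.0
        weights[(even, next_even)] = 1.0
        weights[(odd, even)] = c / n ** 2
    return weights


def conductance(weights: Dict[Tuple[int, int], float], subset: Set[int]) -> float:
    """w(E(S, complement)) / min(vol(S), vol(complement))."""
    cut = sum(w for (u, v), w in weights.items() if (u in subset) != (v in subset))
    vol_in = sum(w * ((u in subset) + (v in subset)) for (u, v), w in weights.items())
    vol_total = 2.0 * sum(weights.values())
    return cut / min(vol_in, vol_total - vol_in)


n, c = 10, 3.0
odd_vertices = set(range(1, 2 * n, 2))
phi_odd = conductance(two_cycle_instance(n, c), odd_vertices)
```

The cut consists of the $n$ matching edges of total weight $c/n$, while the odd side has volume $2n + c/n$, which gives the $\Theta(1/n^2)$ behaviour described above.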

On the other hand, it is not hard to see that an instance with a large gap between $\lambda_2$ and $\lambda_3$ is not necessarily 1-stable, because there could be multiple optimal sparse cuts. A more relaxed stability condition is that any near-optimal sparse cut is "close" to any optimal solution. More precisely, we say a cut $(S, \overline{S})$ is $\epsilon$-close to an optimal cut $(T, \overline{T})$ if the fraction of their symmetric difference $\delta = \mathrm{vol}(S \Delta T)/\mathrm{vol}(V)$ satisfies $\delta \le \epsilon$ or $\delta \ge 1 - \epsilon$. We call an instance of the sparsest cut problem $(c, \epsilon)$-stable if any $c$-approximate solution is $\epsilon$-close to any optimal solution. It is possible to show that if $\lambda_2$ is small and $\lambda_3$ is large then the instance is stable under this more relaxed notion.
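The closeness condition can be stated in a few lines of Python. The sketch represents a cut by one of its sides and takes the weighted degrees as a dictionary; both choices, and the function names, are assumptions of this illustration.

```python
from typing import Dict, Hashable, Set


def symmetric_difference_fraction(
    degrees: Dict[Hashable, float], s: Set[Hashable], t: Set[Hashable]
) -> float:
    """delta = vol(S symmetric-difference T) / vol(V)."""
    return sum(degrees[v] for v in s ^ t) / sum(degrees.values())


def is_eps_close(
    degrees: Dict[Hashable, float], s: Set[Hashable], t: Set[Hashable], eps: float
) -> bool:
    """True if delta <= eps or delta >= 1 - eps."""
    delta = symmetric_difference_fraction(degrees, s, t)
    return delta <= eps or delta >= 1.0 - eps


# Hypothetical example: ten vertices of unit degree.
degrees = {v: 1.0 for v in range(10)}
s = {0, 1, 2, 3, 4}
t = {0, 1, 2, 3, 5}
delta = symmetric_difference_fraction(degrees, s, t)
```

The second alternative, $\delta \ge 1 - \epsilon$, accounts for the fact that $(S, \overline{S})$ and $(\overline{S}, S)$ describe the same cut.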

Corollary 4.17. Any instance of the sparsest cut problem is $(c, \Theta(c\lambda_2/\lambda_3^{3/2}))$-stable for any $c \ge 1$.

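Proof. Let $(T, \overline{T})$ be an optimal cut with $\mathrm{vol}(T) \le \mathrm{vol}(V)/2$ and $\phi = \phi(T)$. Suppose the instance is not $(c, \epsilon)$-stable. Then there exists a cut $(S, \overline{S})$ of conductance at most $c\phi$ and $\mathrm{vol}(S) \le \mathrm{vol}(V)/2$ such that $\mathrm{vol}(S \Delta T)/\mathrm{vol}(V) \in [\epsilon, 1 - \epsilon]$. Let $S_1$ be $S - T$ or $T - S$, whichever is of larger volume. Let $S_2$ be $S \cap T$ or $V - S - T$, whichever is of larger volume. Then, by our assumption, we have $\mathrm{vol}(S_i) \ge \epsilon \cdot \mathrm{vol}(V)/2$ for $i = 1, 2$. Also, for $i = 1, 2$,

$$w(E(S_i, \overline{S_i})) \le w(E(S, \overline{S})) + w(E(T, \overline{T})) \le \phi \cdot \mathrm{vol}(T) + c\phi \cdot \mathrm{vol}(S) \le (1 + c)\phi \cdot \mathrm{vol}(V)/2.$$

Therefore $\phi(S_i) \le (1 + c)\phi/\epsilon$. Finally, observe that $S_3 := V - S_1 - S_2$ is one of these four sets: $T, S, \overline{T}, \overline{S}$. This implies that $\phi(S_3) \le c\phi/\epsilon$. Thus,

$$\frac{\lambda_3}{2} \le \phi_3(G) \le \max_{1 \le i \le 3} \phi(S_i) \le \frac{(1 + c)\phi}{\epsilon} \le O\left(\frac{c\,\lambda_2}{\epsilon\sqrt{\lambda_3}}\right),$$

where the last inequality follows from Theorem 1.2. Therefore $\epsilon = O(c\lambda_2/\lambda_3^{3/2})$. □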

There is also another interpretation of our result through numerical stability. By the Davis-Kahan theorem from matrix perturbation theory (see [Lux07]), when there is a large gap between $\lambda_2$ and $\lambda_3$, then the second eigenvector is stable under perturbations of the edge weights of the graph. More generally, when there is a large gap between $\lambda_k$ and $\lambda_{k+1}$, then the top $k$-dimensional eigenspace is stable under perturbations of the edge weights of the graph. Our result shows that spectral partitioning performs better when the top eigenspace is stable. Some similar results are known in other applications of spectral techniques [AFKMS01, Lux10].

A A New Proof of Cheeger's Inequality

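We will use Lemma 2.6 to derive Cheeger's inequality with a weaker constant. The proof is a simplified version of our second proof of Theorem 1.2. By Proposition 2.1, we assume that we are given a non-negative function $f \in \ell^2(V, w)$ with $\mathcal{R}(f) \le \lambda_2$ and $\mathrm{vol}(\mathrm{supp}(f)) \le \mathrm{vol}(V)/2$ and $\|f\|_w = 1$. Fix $\alpha \in (0, 1)$. Let $I_i := [\alpha^{i+1}, \alpha^i]$. By Lemma 2.6,

$$\mathcal{E}_f(I_i) \ge \frac{\phi^2(f) \cdot \mathrm{vol}^2(\alpha^i) \cdot \mathrm{len}^2(I_i)}{\phi(f) \cdot \mathrm{vol}(\alpha^i) + \mathrm{vol}(I_i)} \ge \frac{\phi^2(f) \cdot \mathrm{vol}^2(\alpha^i) \cdot \mathrm{len}^2(I_i)}{\mathrm{vol}(\alpha^i) + \mathrm{vol}(I_i)} = \frac{\phi^2(f) \cdot \mathrm{vol}^2(\alpha^i) \cdot \alpha^{2i}(1 - \alpha)^2}{\mathrm{vol}(\alpha^{i+1})}.$$

Summing over all intervals, we have

$$\begin{aligned}
\mathcal{E}_f \ge \sum_i \mathcal{E}_f(I_i) &\ge \sum_i \frac{\phi^2(f) \cdot \mathrm{vol}^2(\alpha^i) \cdot \alpha^{2i}(1 - \alpha)^2}{\mathrm{vol}(\alpha^{i+1})} = \phi^2(f) \cdot (1 - \alpha)^2 \sum_i \frac{\mathrm{vol}^2(\alpha^i) \cdot \alpha^{4i}}{\mathrm{vol}(\alpha^{i+1}) \cdot \alpha^{2i}} \\
&\ge \phi^2(f) \cdot (1 - \alpha)^2\, \frac{\left(\sum_i \mathrm{vol}(\alpha^i) \cdot \alpha^{2i}\right)^2}{\sum_i \mathrm{vol}(\alpha^{i+1}) \cdot \alpha^{2i}} \\
&\ge \phi^2(f) \cdot (1 - \alpha)^2 \alpha^2 \sum_i \mathrm{vol}(\alpha^i) \cdot \alpha^{2i},
\end{aligned}$$

where the third inequality follows from (2.1). Changing the order of the summation,

$$\sum_i \mathrm{vol}(\alpha^i)\,\alpha^{2i} = \sum_j \mathrm{vol}(I_j) \sum_{i > j} \alpha^{2i} = \frac{1}{1 - \alpha^2} \sum_j \mathrm{vol}(I_j)\,\alpha^{2(j+1)} \ge \frac{\alpha^2}{1 - \alpha^2},$$

where the last inequality holds by the assumption that $\|f\|_w = 1$. Therefore,

$$\mathcal{E}_f \ge \phi^2(f) \cdot (1 - \alpha)^2 \alpha^2 \cdot \frac{\alpha^2}{1 - \alpha^2} = \phi^2(f)\,\frac{1 - \alpha}{1 + \alpha}\,\alpha^4.$$

Setting $\alpha = (\sqrt{17} - 1)/4$, we get $\phi(f) \le 4.68\sqrt{\lambda_2}$.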
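The choice of $\alpha$ and the resulting constant can be reproduced numerically. The sketch below assumes the final bound has the form $\mathcal{E}_f \ge \phi^2(f)\,\frac{1 - \alpha}{1 + \alpha}\,\alpha^4$, which is the reading consistent with the stated optimiser and with the constant 4.68; the function name `gain` is introduced here for illustration.

```python
import math


def gain(alpha: float) -> float:
    """(1 - alpha) * alpha^4 / (1 + alpha), the factor multiplying phi^2(f)."""
    return (1.0 - alpha) * alpha ** 4 / (1.0 + alpha)


alpha_star = (math.sqrt(17.0) - 1.0) / 4.0
cheeger_constant = 1.0 / math.sqrt(gain(alpha_star))

# A coarse grid search over (0, 1) as an independent check of the maximiser.
grid_best = max((i / 1000.0 for i in range(1, 1000)), key=gain)
```

Setting the derivative of $\ln$ of the factor to zero gives $2\alpha^2 + \alpha - 2 = 0$, whose positive root is $(\sqrt{17} - 1)/4 \approx 0.781$, and $\phi(f) \le \sqrt{\mathcal{E}_f / \mathrm{gain}(\alpha)}$ then yields a constant just below 4.68.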

B A Different Proof of Theorem 1.2

In this section we give a different proof of Theorem 1.2. In particular, given a function $g$ that is a $2k+1$ step approximation of $f$, we lower bound $\mathcal{R}(f) = \mathcal{E}_f$ using Lemma 2.6. This gives Proposition B.2, which can be seen as a weaker version of Proposition 3.2.

Corollary B.1. If $\mathcal{E}_f \le \lambda_k/(C^2 k^2)$ for some constant $C$, then for any function $g \in \ell^2(V, w)$ satisfying (3.1),

$$\|g\|_w^2 \ge \frac{1}{2}.$$

Proof. The statement follows from a simple application of the triangle inequality:

$$\|g\|_w^2 \ge (\|f\|_w - \|f - g\|_w)^2 \ge \left(1 - \sqrt{\frac{4\mathcal{E}_f}{\lambda_k}}\right)^2 \ge \frac{1}{2},$$

where the second inequality follows by (3.1). □
Proposition B.2. For any $2k+1$-step approximation of $f$, called $g$,

$$\mathcal{E}_f \ge \min\left\{\frac{\phi(f)\,\|g\|_w^2}{32k},\ \frac{\phi^2(f)\,\|g\|_w^4}{2048k^2\,\|f - g\|_w^2}\right\}.$$

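Proof. Assume that $\mathrm{range}(g) = \{t_0, t_1, \ldots, t_{2k}\}$ such that $0 = t_0 \le t_1 \le \ldots \le t_{2k}$. For each $1 \le i \le 2k$, we let $I_i$ be the middle part of the interval $[t_{i-1}, t_i]$, i.e.,

$$I_i := \left[\frac{3t_{i-1} + t_i}{4},\ \frac{t_{i-1} + t_i}{2}\right].$$

Let $m_i := (t_{i-1} + t_i)/2$ be the midpoint of $[t_{i-1}, t_i]$, and let $m_{2k+1} := \infty$. Since the intervals are disjoint, by Fact 2.5 we can write

$$\begin{aligned}
\mathcal{E}_f \ge \sum_{i=1}^{2k} \mathcal{E}_f(I_i) &\ge \sum_{i=1}^{2k} \frac{\phi^2(f) \cdot \mathrm{vol}^2(m_i) \cdot \mathrm{len}^2(I_i)}{\phi(f) \cdot \mathrm{vol}(m_i) + \mathrm{vol}(I_i)} \\
&= \frac{1}{16} \sum_{i=1}^{2k} \frac{\phi^2(f) \cdot \mathrm{vol}^2(m_i) \cdot (t_i - t_{i-1})^4}{\phi(f) \cdot \mathrm{vol}(m_i) \cdot (t_i - t_{i-1})^2 + \mathrm{vol}(I_i) \cdot (t_i - t_{i-1})^2} \\
&\ge \frac{\phi^2(f)}{16} \cdot \frac{\left(\sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2\right)^2}{\phi(f) \sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2 + \sum_{i=1}^{2k} \mathrm{vol}(I_i)(t_i - t_{i-1})^2},
\end{aligned} \tag{B.1}$$

where the second inequality follows by applying Lemma 2.6 to each interval $I_i$, and the third inequality follows from (2.1). Now to prove the proposition we simply use the following two claims.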



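Claim B.3.

$$\|f - g\|_w^2 \ge \frac{1}{16} \sum_{i=1}^{2k} \mathrm{vol}(I_i)(t_i - t_{i-1})^2.$$

Proof. Since $g$ is a $2k+1$ step approximation of $f$, for any vertex $v$ such that $f(v) \in I_i$,

$$|f(v) - g(v)| \ge \frac{t_i - t_{i-1}}{4}.$$

Therefore,

$$\|f - g\|_w^2 = \sum_v w(v)\,|f(v) - g(v)|^2 \ge \sum_{i=1}^{2k} \sum_{v:\, f(v) \in I_i} w(v)\,|f(v) - g(v)|^2 \ge \frac{1}{16} \sum_{i=1}^{2k} \mathrm{vol}(I_i)(t_i - t_{i-1})^2.$$

□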

Claim B.4.

$$\sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2 \ge \frac{\|g\|_w^2}{2k}.$$

Proof. The claim follows simply from changing the order of summations:

$$\begin{aligned}
\sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2 &= \sum_{i=1}^{2k} (t_i - t_{i-1})^2 \sum_{j=i}^{2k} (\mathrm{vol}(m_j) - \mathrm{vol}(m_{j+1})) = \sum_{i=1}^{2k} (\mathrm{vol}(m_i) - \mathrm{vol}(m_{i+1})) \sum_{j=1}^{i} (t_j - t_{j-1})^2 \\
&\ge \sum_{i=1}^{2k} (\mathrm{vol}(m_i) - \mathrm{vol}(m_{i+1}))\,\frac{t_i^2}{2k} = \frac{\|g\|_w^2}{2k},
\end{aligned}$$

where the first inequality follows from the Cauchy-Schwarz inequality, and the last equality follows by the fact that for all vertices $v$ we have $g(v) = t_i$ when $m_i \le f(v) < m_{i+1}$. □

By (B.1) and the above claims, we have

$$\mathcal{E}_f \ge \frac{\phi^2(f) \left(\sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2\right)^2}{16\,\phi(f) \sum_{i=1}^{2k} \mathrm{vol}(m_i)(t_i - t_{i-1})^2 + 256\,\|f - g\|_w^2} \ge \min\left\{\frac{\phi(f)\,\|g\|_w^2}{64k},\ \frac{\phi^2(f)\,\|g\|_w^4}{2048k^2\,\|f - g\|_w^2}\right\}.$$

□

Proof of Theorem 1.2. Let $g$ be as defined in Lemma 3.1. If $\lambda_2 \ge \lambda_k/(256k^2)$, then by Cheeger's inequality,

$$\phi(f) \le \sqrt{2\lambda_2} \le \frac{32k\,\lambda_2}{\sqrt{\lambda_k}},$$

and we are done. Otherwise, by Corollary B.1, we have $\|g\|_w^2 \ge 1/2$. Therefore, by Proposition B.2, we have

$$\mathcal{E}_f \ge \min\left\{\frac{\phi(f)\,\|g\|_w^2}{32k},\ \frac{\phi^2(f)\,\|g\|_w^4}{2048k^2\,\|f - g\|_w^2}\right\} \ge \frac{\phi^2(f)}{2^{13} k^2\,\|f - g\|_w^2} \ge \frac{\lambda_k\,\phi^2(f)}{2^{15} k^2\,\mathcal{E}_f},$$

where the last inequality follows by Lemma 3.1. Now the theorem follows from the fact that $\mathcal{E}_f = \mathcal{R}(f) \le \lambda_2$.

□
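Solving the final inequality for $\phi(f)$ makes the constant explicit. The following worked step assumes the chain of inequalities ends with $\mathcal{E}_f \ge \lambda_k \phi^2(f) / (2^{15} k^2 \mathcal{E}_f)$ and uses $\mathcal{E}_f = \mathcal{R}(f) \le \lambda_2$:

```latex
\phi^2(f) \le \frac{2^{15}\, k^2\, \mathcal{E}_f^2}{\lambda_k}
\quad\Longrightarrow\quad
\phi(f) \le 2^{7.5}\, k\, \frac{\mathcal{E}_f}{\sqrt{\lambda_k}}
= 128\sqrt{2}\; k\, \frac{\mathcal{R}(f)}{\sqrt{\lambda_k}}
\le 128\sqrt{2}\; k\, \frac{\lambda_2}{\sqrt{\lambda_k}} .
```

Together with the bound $32k\lambda_2/\sqrt{\lambda_k}$ from the first case, this gives $\phi(f) = O(k)\,\lambda_2/\sqrt{\lambda_k}$ in both cases.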